Open and Closed Models for Networks of
Queues 


By W. WHITT* 
(Manuscript received April 9, 1984) 


This paper investigates the relationship between open and closed models 
for networks of queues. In open models, jobs enter the network from outside, 
receive service at one or more service centers, and then depart. In closed 
models, jobs neither enter nor leave the network; instead, a fixed number of 
jobs circulate within the network. Open models are analytically more tractable, 
but closed models often seem more realistic. Hence, this paper investigates 
ways to use open models to approximate closed models. One approach is to 
use open models with specified expected equilibrium populations. This fixed- 
population-mean method is especially effective for approximately solving large 
closed models, where “large” may mean many nodes or many jobs. The success 
of these approximations is partly explained by limit theorems: Under appro- 
priate conditions, the fixed-population-mean method is asymptotically correct. 
In some cases the open-model methods also yield bounds for the performance 
measures in the closed models. 


I. INTRODUCTION AND SUMMARY


Queueing network models are now widely used to analyze commu- 
nication, computing, and production systems. A relatively well-devel- 
oped theory exists for the Markov Jackson network models and various 
extensions that have a product-form equilibrium (steady-state) distri- 
bution.’ In this paper we consider both the product-form models for 
which exact solutions are possible and more complicated nonproduct-form
models for which approximations are needed. We are primarily
motivated by the desire to develop new approximations for non- 
product-form closed models. We discuss methods for modifying the 
Queueing Network Analyzer (QNA) software package’ so that it can 
be used to calculate approximate congestion measures for closed 
models as well as open models. Our general approach to the closed 
models is to apply previous techniques for open models. Hence, we 
investigate the relationship between open and closed models. 

For simplicity, we first consider a Markov Jackson network model 
with First-Come First-Served (FCFS) multiserver nodes (service cen- 
ters) and one job class. Flow within the network is determined by 
stochastic routing probabilities: Each job completing service at node i
goes immediately to node j with probability $q_{ij}$, independent of the
history of the system. The individual service rates and external arrival 
rates, if any, are independent of the state. The service-time distribu- 
tions are exponential and the external arrival processes, if any, are 
Poisson. It will be clear that the ideas generalize. 


1.1 Open and closed models 

These models can be classified as open or closed. In an open model, 
jobs enter the network at random from outside at a fixed rate, receive 
service at one or more nodes, and eventually leave the network. Thus, 
with an open model the total external arrival rate or throughput is an 
independent variable (specified as part of the model data), and the 
number of jobs in the system is a dependent variable (whose equilib- 
rium distribution is described in the model solution). On the other 
hand, in a closed model there is a fixed population of jobs in the 
network. Hence, with a closed model the number of jobs in the system 
is an independent variable (specified as part of the model data), and 
the throughput (which may be defined, for example, as the departure 
rate from some designated node) is a dependent variable (to be 
calculated and described in the model solution). Since the individual 
service rate is part of the model data, knowing the throughput is 
equivalent to knowing the utilization, which is the expected proportion 
of the servers at the designated node that are busy in equilibrium. 

Of course, there also are more complicated models, in which the 
simple dichotomy above is not valid. For example, Jackson introduced 
models in which the external arrival rate can depend on the total 
number of jobs in the network.’ Then neither the external arrival rate 
nor the network population is fixed. There are also mixed models, 
which have some classes with fixed populations and other classes with 
fixed external arrival rates.*® We will not consider these more general 
models, but we note that open models can be used to approximate 
mixed models in the same way that they can be used to approximate 
closed models. 
It might seem that open models would be more appropriate for most
applications because jobs do usually come from outside, flow through 
the system, and eventually depart. However, closed models are often 
used instead. The representation of flow through the system, i.e., the 
throughput, is easily handled in a closed model by assuming that a 
new job enters the system to replace an old one whenever the old one 
has received all of its required service. This can be represented in the 
closed model by a transition to a designated exit-entry node. At this 
node, arriving jobs complete all of their required service, and depar- 
tures are new jobs. The rate of transitions through this node (which 
is both the arrival rate and the departure rate) can be regarded as the 
throughput. If no such exit-entry node exists originally, it is easy to 
add such a node. The modified network with the additional node is 
equivalent to the original network if all jobs at this new exit-entry 
node have zero service time. 

Evidently, closed models are often applied because it seems natural 
to regard the number of jobs in the system as the independent variable 
and the throughput as the dependent variable. The number of jobs in 
the system is often subject to control; the queueing analysis is desired 
to determine the associated throughputs and response times. For 
example, in production systems, new jobs usually do not arrive at 
random; they are scheduled. In fact, this view was the main reason 
that Jackson extended queueing network theory to cover closed 
models:? 


This extension of the author’s earlier work is motivated by the observation that 
real production systems are usually subject to influences which make for increased 
stability by tending, as the amount of work-in-process grows, to reduce the rate 
at which new work is injected or to increase the rate at which processing takes 
place. 


Similarly, in computing systems the total number of jobs in device 
queues tends to be limited by resource constraints, so that it is natural 
to specify the number of jobs (the multiprogramming level) as a 
decision variable and then calculate the associated throughput (see p. 
116 of Ref. 6). Also, in time-sharing systems the number of jobs is 
limited by the number of sources (terminals), so that the total number 
of jobs is not unbounded (see p. 60 of Ref. 6). Hence, even though 
closed models are significantly more difficult to analyze because of the 
normalization constant or partition function, there are good reasons 
for applying them. 


1.2 The fixed-population-mean method 


In this paper we propose and investigate a different approach that 
may sometimes be an attractive alternative. We propose using the 
open model with specified expected equilibrium population, which we 
refer to as the Fixed-Population-Mean (FPM) method. With the FPM
method, we have the analytically more elementary open model, but 
the number of jobs (or, more precisely, its mean) becomes an inde- 
pendent variable and the throughput or total arrival rate becomes a 
dependent variable. Even though some of the initial modeling assump- 
tions may not seem appropriate (e.g., unlimited population and Pois- 
son arrivals), we believe that the approach has potential. With regard 
to the modeling assumptions, it is important to remember that the 
model solutions are describing only the equilibrium or, equivalently, 
long-run averages. Moreover, the closed model assumptions are often 
not entirely appropriate either. In many situations where closed 
models are applied, the total population is not nearly fixed. The FPM 
equilibrium solution may better describe these systems. Moreover, we 
can modify the FPM method in various ways to obtain a better 
description. 

Even when a closed model is deemed appropriate, the open model 
with the FPM method can be useful because it often provides a
convenient approximation for the more difficult closed model. Cer- 
tainly the required computation is significantly reduced. In some cases, 
throughputs can be calculated by hand by the FPM method when 
some computer codes for closed models are unable to obtain any 
solution. Moreover, in many cases the results are very close. 

The FPM method also forms the basis for one procedure to calculate 
approximate congestion measures for closed non-Markov networks 
containing multiserver FCFS nodes with nonexponential service-time 
distributions, using previously developed approximation procedures 
for open non-Markov networks such as the Queueing Network Ana- 
lyzer (QNA).’ In fact, the primary motivation for this work was the 
desire to modify QNA so that it can analyze closed models as well as 
open models. The FPM method is one way to do this. Several possible 
approaches for calculating approximate congestion measures for non- 
Markov closed networks are described in Section X. 

For the basic Jackson network we are now considering, we imple- 
ment the FPM method by identifying the external arrival rate that 
yields the specified expected equilibrium number of jobs in the system. 
The standard application would be to a system that was previously 
analyzed by a closed model. Consider such a closed model with a 
designated exit-entry node. In the closed model, an arrival to this node 
from elsewhere in the network completes its required service, and a 
departure represents a new job entering the system. To obtain the 
associated open model, cut the flow into this exit-entry node, let all 
internal arrivals into this node leave the system, and insert an external 
Poisson arrival process. The FPM throughput is the rate or intensity 
of the external Poisson arrival process for which the expected equilibrium
population has the specified value. The approximate mean number
of jobs at each node is the mean number in the open model with
the FPM arrival rate.

In fact, in the FPM procedure described above, it is not necessary 
to have a special exit-entry node. Any node can serve as the exit-entry 
node. For the Jackson product-form network we are considering, the 
equilibrium distribution of the resulting open model is independent of 
the node chosen. Choosing the exit-entry node can be important, 
however, if we do not have a product-form network. Then it may also 
be appropriate to let the new external arrival process be something 
other than a Poisson process. 

In this introductory section we give a few elementary examples to 
illustrate the FPM method. However, the primary motivation is the 
need for approximate methods to analyze more complicated models, 
e.g., with multiple job classes, nonexponential FCFS servers, priorities, 
etc. It should be clear that the FPM method is a general approach 
that applies to these more complicated models. We believe that the 
performance of the FPM method for Jackson networks indicates the 
performance that can be expected for more complicated models. 


Example 1. Consider a closed Markov cyclic network of single-server
FCFS queues with K jobs of a single class. Let there be $n_1$ nodes
having mean service time 1 and $n_2$ nodes having mean service time $\tau$,
arranged in any order. As usual, cyclic means that all departures from
node j go next to node j + 1 for $1 \le j \le n_1 + n_2 - 1$ and all departures
from node $n_1 + n_2$ go next to node 1. To apply the FPM method, cut
the flow into one node, let all original arrivals on that arc leave the
system, and insert an external Poisson arrival process. We identify
the external arrival rate in the associated open model, say $\lambda$, such that
the expected equilibrium total population is K. Since we have a cyclic
network, the arrival rate at each node is the external arrival rate.
(Otherwise, we would have to solve the traffic rate equations.) Recalling
that the equilibrium distribution in the open model is equivalent
to independent M/M/1 queues, we solve


$$\frac{n_1\lambda}{1-\lambda} + \frac{n_2\lambda\tau}{1-\lambda\tau} = K$$

for $\lambda$, which is a quadratic equation. (If there were m different service
rates, then we would have a polynomial of degree m.)
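
The quadratic can be solved directly. The following Python sketch is only an illustration and is not part of the original analysis; the function name `fpm_throughput_two_types` and its argument names are hypothetical. It clears denominators in the population equation n1*lam/(1 - lam) + n2*lam*tau/(1 - lam*tau) = K and returns the smaller root, which is the one that keeps both traffic intensities below 1.

```python
import math


def fpm_throughput_two_types(K, n1, n2, tau):
    """Return the FPM arrival rate for the cyclic network of Example 1.

    Clearing denominators in
        n1*lam/(1 - lam) + n2*lam*tau/(1 - lam*tau) = K
    gives a*lam**2 - b*lam + K = 0 with
        a = tau*(n1 + n2 + K),  b = n1 + n2*tau + K*(1 + tau).
    The smaller root is the one with lam < min(1, 1/tau).
    """
    a = tau * (n1 + n2 + K)
    b = n1 + n2 * tau + K * (1.0 + tau)
    disc = b * b - 4.0 * a * K
    return (b - math.sqrt(disc)) / (2.0 * a)


def open_population(lam, n1, n2, tau):
    """Expected total population of the open model at arrival rate lam."""
    return n1 * lam / (1.0 - lam) + n2 * lam * tau / (1.0 - lam * tau)


if __name__ == "__main__":
    lam = fpm_throughput_two_types(K=20, n1=10, n2=10, tau=1.2)
    print(lam, open_population(lam, 10, 10, 1.2))
```

For K = 20, n1 = n2 = 10, and tau = 1.2, the returned rate is close to 0.45, and substituting it back into the population equation recovers a mean population of 20.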

To illustrate, if K = 20, $n_1 = n_2 = 10$, and $\tau = 1.2$, then the approximate
throughput is $\lambda = 0.45$. In Section IV we prove that this is a
lower bound for the throughput in the original closed model. In (15) we
suggest as a possible improvement $\lambda(n_1 + n_2 + K)/(n_1 + n_2 + K - 1)$,
which in this case is 0.46. The actual throughput in the original closed
network also turns out to be 0.46. This is easily determined using any
software package for closed Markovian networks of queues; we used
PANACEA (Ref. 8).

Since the mean service times at different nodes do not differ much,
we could also use a quicker approximation, based on a linear equation
instead of the quadratic equation, obtained by assuming that all $n_1 + n_2$
nodes have mean service time $\bar{\tau} = (n_1 + n_2\tau)/(n_1 + n_2)$, and then
applying the FPM method, which yields the almost instantaneous
approximation $K\bar{\tau}^{-1}/(K + n_1 + n_2) = 20/44 = 0.45$ for the throughput.

This last balanced network approximation can also be used directly
in the closed network, which corresponds to combining the last two
suggested improvements. (See Section III.) The resulting approximate
throughput by this method is $K\bar{\tau}^{-1}/(K + n_1 + n_2 - 1) = 0.47$. This
direct balanced network approximation for closed networks is in fact
an upper bound on the throughput in the closed model, as was first
shown by Zahorjan et al. (Ref. 9; also see Refs. 10 and 11).
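
The two balanced-network values can be checked with a short derivation. The sketch below is an illustration under stated assumptions: it writes $n = n_1 + n_2$ and uses $\bar{\tau}$ for the common mean service time assigned to every node, with the numbers K = 20, $n_1 = n_2 = 10$, and $\tau = 1.2$ taken from the example.

```latex
\begin{gather*}
\bar{\tau} = \frac{n_1 + n_2\tau}{n_1 + n_2} = \frac{10 + 12}{20} = 1.1, \\
n\,\frac{\lambda\bar{\tau}}{1 - \lambda\bar{\tau}} = K
  \;\Longrightarrow\;
  \lambda\bar{\tau} = \frac{K}{K + n}
  \;\Longrightarrow\;
  \lambda = \frac{K\bar{\tau}^{-1}}{K + n} = \frac{20}{1.1 \times 40} = \frac{20}{44} \approx 0.4545, \\
\text{balanced closed network:}\quad
  \lambda = \frac{K\bar{\tau}^{-1}}{K + n - 1} = \frac{20}{1.1 \times 39} \approx 0.4662 .
\end{gather*}
```

To two decimal places these are the values 0.45 and 0.47 quoted in the text.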

If we use $\lambda = 0.45$ as the approximate throughput by the FPM
method, then with the M/M/1 formula the mean number of jobs at
each node with mean service time 1 (1.2) is 0.82 (1.18); for the closed 
model, it is 0.83 (1.17). The expected sojourn time at each of these 
nodes by the FPM method is 1.82 (2.62); for the closed model, it is 
1.79 (2.53). For practical purposes, the standard congestion measures 
calculated by the FPM method agree with those for the closed model 
in this example. 

Note that the FPM solution does not change if we multiply the 
population and the number of nodes of each type by a common 
constant. It turns out that the quality of the approximation improves 
as the network grows in this way. On the one hand, this means that 
the FPM method does not perform well when there are few nodes, 
e.g., when $n_1 = n_2 = 1$ here. On the other hand, the FPM method tends
to perform well for the large models that are more difficult for closed 
network algorithms. In fact, in Section V we prove that the FPM 
method is asymptotically correct for such growing closed networks. 
This asymptotic property of large closed networks was apparently first 
observed by Gordon and Newell.” We contribute by providing a rig- 
orous proof based on the local central limit theorem for sums of 
independent and identically distributed random vectors.” Also, we 
stress the significance of the FPM method in this asymptotic analysis. 
Algorithms for closed models have difficulty as the number of nodes 
increases. Evidently, no existing closed-network algorithm is able to 
handle the case of 200 nodes and 200 customers for this numerical 
example. With the aid of new asymptotic theory (Refs. 13 and 14), PANACEA (Ref. 8) is
able to solve much larger networks, but the asymptotic theory does 
not apply to this example because it requires a decoupling infinite- 
server node (see Section 1.4). 
In general, when we apply the FPM method, we do not get a
quadratic equation. However, the expected equilibrium population in 
the newly created open network is an increasing function of the 
external arrival rate, so that it is not difficult to identify the external 
arrival rate which yields the desired fixed population mean by a search 
procedure. In fact, it is usually possible to quickly obtain the desired 
throughput with a programmable hand calculator that can find the 
roots of an equation. (At the expense of some added complexity, this 
same general approach can be used for multiple job classes. We give 
an efficient iterative algorithm for special cases involving infinite- 
server nodes in Section VI.) 
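
Because the expected population is increasing in the external arrival rate, a bisection search is enough to carry out this inversion. The following Python sketch is an illustration under simplifying assumptions that are not taken from the paper: every node is a single-server M/M/1 queue, there is one job class, and the visit ratios (the node arrival rates per unit of external arrival rate, obtained from the traffic rate equations) are supplied by the caller. The names `open_population` and `fpm_rate` are hypothetical.

```python
def open_population(lam, visit_ratios, service_rates):
    """Expected total population of an open network of M/M/1 nodes.

    Node i sees arrival rate lam * visit_ratios[i] and has service rate
    service_rates[i]. Returns infinity if any node is saturated.
    """
    total = 0.0
    for v, mu in zip(visit_ratios, service_rates):
        rho = lam * v / mu
        if rho >= 1.0:
            return float("inf")
        total += rho / (1.0 - rho)
    return total


def fpm_rate(K, visit_ratios, service_rates, tol=1e-12):
    """Find the external arrival rate whose expected population equals K.

    Bisection is valid because open_population is increasing in lam on
    (0, lam_max), where lam_max saturates the bottleneck node.
    """
    lo = 0.0
    hi = min(mu / v for v, mu in zip(visit_ratios, service_rates) if v > 0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if open_population(mid, visit_ratios, service_rates) < K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Cyclic network of Example 1: every visit ratio is 1.
    rates = [1.0] * 10 + [1.0 / 1.2] * 10
    print(fpm_rate(20, [1.0] * 20, rates))
```

Applied to the cyclic network of Example 1, the search returns the same rate as the quadratic equation, about 0.45.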

However, it is usually not necessary to carry out such a special 
inversion procedure. As is standard for closed models, we usually want 
to determine the throughput as a function of the (expected) network 
population. Hence, we simply solve the open model for a range of 
possible external arrival rates and express the expected equilibrium 
population as a function of the external arrival rate. It is then easy to 
invert the function if desired. Moreover, we can also describe the 
resulting population variability in the open model as a function of the 
external arrival rate. Thus, the FPM method consists of little more 
than using open models in situations where closed models were used 
before. Our object, then, is to better understand the relationships 
between these two kinds of models. We propose some algorithms and 
obtain some insight about when they will work well and when they 
will not. 
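
The tabulation described above can be sketched in a few lines. The Python code below is an illustration under assumptions that are not from the paper: all nodes are single-server M/M/1 queues visited once per job (as in the cyclic network of Example 1), so that in the product-form open model the node populations are independent geometric random variables with mean rho/(1 - rho) and variance rho/(1 - rho)**2. The function names are hypothetical.

```python
def population_mean_and_variance(lam, service_times):
    """Mean and variance of the total population in the open model.

    Each node is an M/M/1 queue with traffic intensity rho = lam * s.
    Independence across nodes lets the node variances be added.
    """
    mean = 0.0
    var = 0.0
    for s in service_times:
        rho = lam * s
        if not 0.0 <= rho < 1.0:
            raise ValueError("every node needs traffic intensity below 1")
        mean += rho / (1.0 - rho)
        var += rho / (1.0 - rho) ** 2
    return mean, var


def sweep(service_times, arrival_rates):
    """Tabulate (rate, mean population, population variance) triples."""
    return [
        (lam,) + population_mean_and_variance(lam, service_times)
        for lam in arrival_rates
    ]


if __name__ == "__main__":
    times = [1.0] * 10 + [1.2] * 10
    for lam, mean, var in sweep(times, [0.1, 0.2, 0.3, 0.4, 0.45]):
        print(f"{lam:.2f}  {mean:8.3f}  {var:8.3f}")
```

Reading the table from the population column back to the rate column gives the throughput as a function of the expected population, and the variance column indicates how far the open model departs from a fixed population.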

When both closed and open models are available, the appropriate 
model might be chosen according to which better describes the popu- 
lation variability. We suggest using estimates of the population vari- 
ance to help identify an appropriate model. It turns out that the 
population variability in the open model is often less than might be 
expected (Section II), so that the two models are often remarkably 
similar. For larger networks (large population or many nodes), the 
differences are often small relative to the quality of data typically 
available for model fitting. When this is the case, the open model
is usually preferred because it is much easier to analyze. 

The possible advantage of closed models over open models is also 
reduced if we do not restrict attention to Markov open models. For 
example, if we use QNA to approximately analyze a non-Markov open 
model, then we have an additional degree of freedom in modeling the 
variability because we can select variability parameters for each serv- 
ice-time distribution and each arrival process. If the actual arrivals 
are scheduled, as in many production systems, then it is natural to use 
clocked arrivals in QNA, i.e., deterministic interarrival times, which 
is achieved by setting the variability parameter for the external arrival 
process equal to zero. With an open model, we are not forced to have
a Poisson external arrival process. From direct modeling considera- 
tions, the open model with clocked external arrival processes is often 
more realistic than the closed model. 

Furthermore, given that we are actually interested in a closed model, 
the variability parameters offer the possibility of improved approxi- 
mations by open models. Since the population constraint in the closed 
model tends to reduce the variability (see Section IV), a promising 
heuristic approximation procedure with the FPM method (suggested 
by H. Heffes) is to reduce the variability parameters in the approxi- 
mating open model. For example, with a Jackson network of single- 
server queues, we might treat each node as a D/M/1 queue instead of 
an M/M/1 queue, but the actual procedure would have to be more 
sophisticated. The general approach using QNA for non-Markov 
closed models is to cut the flow into one node and replace it with an 
external arrival process. First, as described in Section X, we can let 
the variability parameter of the external arrival process be such that 
it agrees with the variability parameter of the departure process from 
the network. We use QNA to calculate approximate variability param- 
eters for the arrival process to each node. Afterwards, to improve the 
approximation of the closed model, we can systematically reduce all 
these variability parameters. The reduction should depend on the 
network parameters, with the variability parameters evidently being 
reduced less as the number of nodes or the number of jobs increases. 
We briefly investigate this possibility, but we have just begun studying 
refined approximation procedures of this kind. 


1.3 The finite-waiting-room refinement 


We also propose a refinement of the FPM method for approximating 
closed models, which is especially useful for small models. We apply 
the network population constraint given for the closed model to each 
node separately in the open model. When there are K jobs in the 
closed model, we allow at most K jobs at each node in the open model. 
However, we implement this Finite-Waiting-Room (FWR) approxi- 
mation within the product-form equilibrium distribution of the open 
model. We act as if there is capacity K at each node in the open model, 
but we do not analyze the modified open model exactly. Instead, we 
keep the product-form equilibrium distribution in the open model, and 
modify the distribution of the number of jobs at each node. 

If $N_i^o$ is the equilibrium number of jobs at node i in the open model
without the refinement, then we use the conditional distribution of
$N_i^o$ given that $N_i^o \le K$. Since $N_i^o$ has the distribution of a birth-and-death
process in a Jackson network, this conditional distribution,
obtained simply by truncating the original distribution at K and
renormalizing, is tantamount to imposing a finite waiting room at that
node in isolation. [This conditioning can also be used as an approximation
for more general models (see Refs. 15 and 16).]

Let $N_i^f$ be the equilibrium number of jobs at node i by the FWR
method; then


$$P(N_i^f = k) = P(N_i^o = k \mid N_i^o \le K) = P(N_i^o = k)/P(N_i^o \le K). \qquad (1)$$


For an M/M/1 queue, the mean is $EN_i^o = \rho_i/(1 - \rho_i)$ and the
utilization is $u_i^o = P(N_i^o > 0) = \rho_i$, where $\rho_i = \lambda_i/\mu_i$ is the traffic
intensity at node i, based on the net arrival rate $\lambda_i$ and service rate $\mu_i$.
The corresponding quantities $EN_i^f$ and $u_i^f$ with the FWR method are


$$EN_i^f = w(\rho_i, K)\,EN_i^o,$$
$$w(\rho_i, K) = \frac{1 - (K + 1)\rho_i^K + K\rho_i^{K+1}}{1 - \rho_i^{K+1}},$$
$$u_i^f = P(N_i^f > 0) = \frac{\rho_i - \rho_i^{K+1}}{1 - \rho_i^{K+1}}, \qquad (2)$$


provided that $\rho_i \ne 1$ (see Section 2.5 of Ref. 17). Obviously, $EN_i^f <
EN_i^o$ and $u_i^f < u_i^o$. If we let $u_i^f$ be free, fix $u_i^o$, and let $EN_i^f = EN_i^o$, then
$u_i^f > u_i^o$ (see Section IV), which is a refinement in the right direction.
Moreover, if $u_i^c$ is the utilization of node i and $N_i^c$ is the equilibrium
number of jobs at node i in the closed model, then $u_i^f \le u_i^c$ when $EN_i^f
\le EN_i^c$. Since the ratio of the utilizations at any two nodes in the
closed model is the same as in the open model,*® we thus obtain a 
valid lower bound on $u_i^c$ by this procedure, namely,


$$u_i^c \ge \min_j \{(u_i^o/u_j^o)\,u_j^f\}. \qquad (3)$$


We need (3) to obtain the valid lower bound because the property
$u_i^o/u_j^o = u_i^c/u_j^c$ for all i and j does not hold for $u_i^f/u_j^f$. [See (15) and
Section IV.]

Example 2. Consider a closed Markov network with n single-server 
nodes and K jobs. Let the service rates and net arrival rates be 
identical, so that the equilibrium distribution is symmetric. When 
n = 4 and K = 2, the server utilizations by direct analysis of the closed
model, the FPM method, and the FPM/FWR method are, respectively, 
0.400, 0.333, and 0.384. By using the FWR refinement, the error is
reduced from 17 percent to 4 percent.
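
The three utilizations in Example 2 can be reproduced numerically. The Python sketch below is an illustration, not the paper's procedure: the function names are hypothetical, the closed-model value uses the standard balanced-network result K/(K + n - 1) for a symmetric closed network of single-server nodes, and the FPM/FWR value is found by bisection on the traffic intensity so that n times the truncated mean of (2) equals K. The search bracket (0, 0.99) is an assumption that holds for these parameters.

```python
def fwr_mean(rho, K):
    """Mean number at a node under the FWR refinement, from (2); rho != 1."""
    w = (1.0 - (K + 1) * rho**K + K * rho ** (K + 1)) / (1.0 - rho ** (K + 1))
    return w * rho / (1.0 - rho)


def fwr_utilization(rho, K):
    """Probability that the node is busy under the FWR refinement, from (2)."""
    return (rho - rho ** (K + 1)) / (1.0 - rho ** (K + 1))


def bisect_increasing(f, target, lo, hi, tol=1e-12):
    """Solve f(x) = target for an increasing function f on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def example_2(n=4, K=2):
    """Return (closed, FPM, FPM/FWR) utilizations for the symmetric network."""
    closed = K / (K + n - 1)          # balanced closed network
    fpm = K / (K + n)                 # solves n * rho / (1 - rho) = K
    rho_f = bisect_increasing(lambda r: n * fwr_mean(r, K), K, 1e-9, 0.99)
    return closed, fpm, fwr_utilization(rho_f, K)


if __name__ == "__main__":
    print(example_2())
```

With n = 4 and K = 2 the three values agree with 0.400, 0.333, and 0.384 to three decimal places.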

Using the FWR refinement to the FPM method can yield external 
arrival rates for which $\rho_i \ge 1$ at some nodes. Limited numerical
experience indicates that the quality of the approximation often de- 
teriorates in this case. 


1.4 Decoupling infinite-server nodes 


There need not be many nodes for the FPM method to be effective. 
The FPM method is particularly appealing to approximately solve 
closed Markovian networks with few nodes but a large population and
an Infinite-Server (IS) node with relatively low service rate. For these 
models, the FPM method extends easily to multiple job classes. How- 
ever, the need for help with these difficult models is much less now 
because an efficient algorithm for them has recently been developed 
by McKenna, Mitra, and Ramakrishnan (Refs. 13 and 14), which is implemented in
their PANACEA software package (Ref. 8).

The PANACEA algorithm exploits integral representations and 
asymptotic expansions to reduce the original large closed network to 
many much smaller closed networks. Under appropriate conditions, 
the difficult partition function of the original closed model and related 
quantities such as the utilization of a particular class at a particular 
node can be represented by asymptotic expansions in which the 
coefficients are constructed from the partition functions of the smaller 
closed networks (the pseudonetworks in Ref. 14, which typically in- 
volve at most three classes and a total of seven customers). Moreover, 
the asymptotic expansions permit a thorough analysis of the trunca- 
tion errors: The truncation error is less in absolute value than the first 
neglected term and has the same sign. 

The asymptotic expansions underlying the new capabilities in PAN- 
ACEA are based on several assumptions. First, it is assumed that each 
class visits an IS node [see (27) of Ref. 14]. Second, it is assumed that 
the population of each class is large [see (17) of Ref. 14]. Third, it is 
assumed that the individual service rates at the IS node are signifi- 
cantly lower than the service rates in the rest of the network [see (18) 
of Ref. 14]. Finally, it is assumed that utilizations of the non-IS nodes 
are not close to their critical values, i.e., they are not in heavy traffic 
[see (29) through (31) of Ref. 14]. It is worth noting that these 
assumptions are often realistic—e.g., in computing systems where the 
IS nodes correspond to “think times” at terminals. 

It turns out that the FPM method tends to work well under these 
same conditions. Unlike PANACEA, however, the FPM method is an 
approximation. (The asymptotic expansions in PANACEA also can 
be regarded as approximations, but of a different kind; they are a 
numerical method that can achieve any degree of accuracy given 
enough computation. On the other hand, the FPM method changes 
the model, so that the answers are good only if the two model solutions 
are close.) The FPM method in this situation can be derived by a 
procedure that at first seems to be different from the FPM method. 
This alternate procedure is motivated by the observation that under 
the stated conditions the departure processes from the special IS nodes 
tend to behave much like Poisson processes. Moreover, the subnetwork 
without the IS nodes tends to behave much like an open network with 
an external Poisson arrival process. This is partly substantiated by 
previous work (Ref. 18) in which we showed that, under appropriate conditions,
the departure process from an IS node with a fixed general
stationary arrival process and a general service-time distribution ap- 
proaches a Poisson process that is independent of the arrival process 
as the individual service rate at the IS node decreases. Reference 18 
does not directly apply here because the arrival processes at the IS 
node are changing too, but Ref. 18 suggests that the departure proc- 
esses from the IS nodes are approximately Poisson processes that are 
independent of the rest of the network under the stated assumptions. 
Corresponding limit theorems for the situation here are contained in 
Section VIII. 

The key to our procedure for these models, as with the asymptotic 
expansions underlying PANACEA, is a large population and the 
presence of IS nodes with relatively low service rates. The FPM 
method can be used more generally, but there is stronger supporting 
logic with the IS nodes. The FPM method works well if there are 
several IS nodes, as long as each class visits one of them, but for 
simplicity we assume that there is a single IS node visited by all 
classes. Also, each class can visit more than one IS node, but we 
assume that only one IS node has relatively low service rate, so that 
jobs tend to accumulate there. We use this IS node to decouple the 
network. We let its departure process leave the system and replace it 
by an external arrival process. The external arrival process is a set of 
independent Poisson processes, with one Poisson process for each 
class. Equivalently, there is a simple Poisson external arrival process 
and fixed probabilities that each arrival belongs to one of the classes. 
We approximately solve the original closed network by identifying the 
appropriate external arrival rates for the associated open model. We 
use the special IS node to determine what rates are appropriate. We 
do this by simply equating the arrival and the departure rates for each 
class at the IS node. It turns out that this procedure is equivalent to 
the FPM method discussed above (see Section VI). 


Example 3. To illustrate the FPM method with an IS node having 
relatively low service rate, we consider a central processor model 
treated by McKenna, Mitra, and Ramakrishnan (Ref. 13). This is a closed
cyclic network with only two nodes. The first node is the CPU, which 
has a single server, where service is provided according to the proces- 
sor-sharing discipline. The second node is a “think” node, which is an 
IS node, representing independent delays at terminals before a job is 
next sent to the CPU. (Because of insensitivity properties,°*!*”° only 
the means of the service-time distributions matter for the equilibrium 
distribution. We can also equivalently regard the CPU as an FCFS 
node with an exponential service-time distribution.) We shall consider 
the case of one job class, which is test problem I described in Table I
of Ref. 13. The mean service times at the two nodes are 1 and 240,
respectively, so indeed the IS node has relatively low individual service
rate.


Table I. A comparison of throughputs using closed and open models for the
two-node single-class network model of a central processor in Example 3
of Section 1.4

                       Throughput or utilization of CPU

           For the closed model               For the open model using FPM method
           --------------------------------   ---------------------------------------------------------
           From Ref. 13                                                              Taylor's series (39)
           -------------------  Version 2.1   First upper  First lower  FPM solution  -------------------
Number     By         By        of            bound        bound        (28) and      Two       Three
of jobs    CADS       PANACEA   PANACEA       (30)         (31)         (38)          terms     terms

10         0.0417     0.0414    0.0415        0.0417       0.0415       0.0415        0.0415    0.0415
50         0.207      0.207     0.2073        0.2083       0.2072       0.2072        0.2074    0.2072
100        Breakdown  0.413     0.4138        0.4167       0.4137       0.4137        0.4150    0.4143
200        Breakdown  0.819     0.8204        0.8333       0.8123       0.8150        0.8298    0.8269

This example was significant in Ref. 13 because it demonstrated the 
advantage of PANACEA over previous closed network algorithms, in 
particular CADS.”' In several cases CADS was unable to obtain a 
solution. This example is also significant here because for it the FPM 
method is both easy and accurate. The FPM method only requires the 
solution of a quadratic equation [see (38) in Section VII]. A quick 
upper bound on the CPU utilization can be found by simply multiply- 
ing the population times the IS individual service rate [see (30) in 
Section VI]. Both of these methods perform remarkably well. The 
throughput results for several population sizes are described in Table 
I. The CPU node has a processor-sharing service discipline with mean
processing time $\mu_1^{-1} = 1$. Hence, the throughput of jobs at the CPU
is equal to the utilization of the CPU. Node 2 is an infinite-server
delay node representing the think time of users at terminals. The
mean think time is $\mu_2^{-1} = 240$.
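The two hand calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CPU is treated as an M/M/1 node and the think node as an infinite-server node, both fed at a common rate, and the expected total population is set equal to the number of jobs. The quadratic written in the comments is derived here from that assumption and is not copied from (38); the function names are hypothetical.

```python
import math


def fpm_throughput(num_jobs, cpu_mean=1.0, think_mean=240.0):
    """FPM throughput for a single-server CPU plus an infinite-server think node.

    Assumed open model with arrival rate lam:
      E[N_cpu]   = lam * cpu_mean / (1 - lam * cpu_mean)
      E[N_think] = lam * think_mean
    Setting E[N_cpu] + E[N_think] = num_jobs gives the quadratic
      (think_mean * cpu_mean) * lam**2
        - (think_mean + cpu_mean + num_jobs * cpu_mean) * lam + num_jobs = 0.
    """
    a = think_mean * cpu_mean
    b = -(think_mean + cpu_mean + num_jobs * cpu_mean)
    c = float(num_jobs)
    disc = b * b - 4.0 * a * c
    # The smaller root is the one with CPU utilization below one.
    return (-b - math.sqrt(disc)) / (2.0 * a)


def first_upper_bound(num_jobs, think_mean=240.0):
    """Population times the individual service rate of the think node."""
    return num_jobs / think_mean


if __name__ == "__main__":
    for k in (10, 50, 100, 200):
        print(k, round(fpm_throughput(k), 4), round(first_upper_bound(k), 4))
```

With mean service times 1 and 240, the values printed for 10, 50, 100, and 200 jobs can be compared with the FPM Solution and First Upper Bound columns of Table I.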

For the cases in Table I it is apparent that even the trivial upper 
bound is adequate for practical purposes. Moreover, as we have re- 
marked, the FPM throughput itself is a lower bound (see Section IV), 
so that from the FPM method alone we can determine that the quality 
of the approximation is satisfactory. Since the FPM throughput is a 
lower bound, from Table I we see that in some cases the FPM 
throughput is actually slightly more accurate than the published values 
in Ref. 13, but of course the differences are not significant for practical 
purposes. In the difficult case of 200 jobs, Version 2.1 of PANACEA 
terminated with lower and upper bounds of 0.8129 and 0.8204, based 
on four terms of the asymptotic expansion. In the other cases the two 
bounds coincide for the specified accuracy. The main point is that 
essentially the same answers can be obtained quickly by hand. (See 
Sections VI and VII for additional discussion.) 

With the FPM method we avoid closed networks and the associated 
partition functions entirely. Instead, we approximately solve the orig- 
inal closed network by solving a related open model. In the case of 
multiple job classes, we iteratively solve a sequence of associated open 
networks. By working with open networks, we never calculate the 
complete distribution of the number of jobs of each class at each node. 
With open networks it suffices to work with expected values. By 
exploiting simple monotonicity properties, we are also able to give 
upper and lower bounds on the desired approximate solution at each 
iteration. Finally, we are able to treat very general networks; e.g., the 
subnetwork can have multiserver nodes. In fact, the approximation 
procedure is ideal for the closed-model analogs of the non-Markov 
networks analyzed by QNA,’ in which there are FCFS nodes with
nonexponential service-time distributions. For the first step of the 
analysis, in which we replace the departure process from the IS node 
by an external Poisson arrival process, the FPM method is still 
asymptotically correct. Hence, for these closed non-Markov models 
with decoupling IS nodes, it appears that the FPM method with QNA 
should perform about as well as the original QNA approximation for 
the corresponding open non-Markov model. 


1.5 The rest of this paper 


The rest of this paper is organized as follows. In Section II we 
discuss population variability in open networks, and show that it tends 
to be relatively small in large networks, which supports using the FPM 
method to approximate large closed models. We also suggest measuring 
population variability to help decide which model to use. In Section 
III we compare the throughputs in closed models and open models 
(using the FPM method) for a special class of balanced networks, and 
we show that the differences are, for practical purposes, negligible 
when the network is large (but not when the network is small). This 
example provides a convenient, simple, quantitative characterization 
of the difference between the models, as far as throughputs are 
concerned. The explicit balanced network results also suggest possible 
refinements of the FPM method for unbalanced networks. 

In Section IV we present theoretical results about the closed network 
and the FPM approximation. We prove that the utilizations and 
throughput calculated by the FPM method are always lower bounds 
for the corresponding quantities in the closed network (see Theorem 
1). For the special case of single-server and infinite-server nodes, this 
result can also be deduced from Zahorjan.” We not only treat general 
nodes such as multiserver nodes, but we treat models with several job 
classes. To make our comparison and establish other properties of the 
closed model, we exploit the log-concavity”® of the distribution of the 
number of jobs at each node in the associated open model. In various 
ways we show that the distribution of jobs is less variable in the closed 
network than in the associated open network. In particular, given 
ordered means at any node, we establish increasing concave stochastic 
order (see Theorem 2).”* To do this, we introduce and apply the notion 
of one distribution being log-concave relative to another (see Defini- 
tion 1). 

It is intuitively clear that the population constraint should introduce 
negative dependence among the queue lengths at the different nodes. 
In Section IV we also show that recently developed concepts of 
multivariate negative dependence, such as reverse-rule distribu- 
tions”>”* and negative association,”’ are ideally suited to make this 
idea precise. Indeed, the multivariate distribution of jobs at the different
nodes has all of these properties. The closed Markovian network
model can be regarded as a canonical example of negative dependence. 

In Section V we present some additional theory to show that the 
FPM method tends to perform well for large networks. We prove that 
the FPM method is asymptotically correct as the number of nodes in 
the closed model increases with the number of customers per node 
held fixed (see Theorem 7). For this result, we can let the network 
grow in an acyclic manner so that the connectivity does not increase. To
obtain a rigorous proof, we apply the local central limit theorem for
partial sums of random vectors.” This result applies to Example 1 as
a special case. 

In Section VI we present a variant of the FPM method to approximate
closed networks with a large population and a decoupling IS node with 
relatively low service rate. We exploit monotonicity to obtain an 
efficient algorithm for multiple job classes. In Section VII we illustrate 
the FPM method in this context by considering the central processor 
model in Example 3. We consider cases involving two job classes as 
well as one. This example is taken from Ref. 13, so that we can 
conveniently compare the FPM method to numerical results for PAN- 
ACEA and the CADS algorithm for closed models.®”! 

In Section VIII we present theoretical results to support the FPM 
method in the context of Sections VI and VII. We show that the 
vector-valued queue-length process in the subnetwork of the closed 
model without the IS node converges in distribution to the correspond- 
ing stochastic process in the approximating open model with a Poisson 
external arrival process as the populations increase and the individual 
service rates at the IS node decrease appropriately. We establish 
convergence in distribution (weak convergence”*”’) of both the sto- 
chastic processes (see Theorem 8) and the equilibrium distributions 
(see Theorem 9). Convergence of the departure process from the IS 
node to a Poisson process is established as in Refs. 18 and 30; 
convergence of the associated vector-valued queue-length process is 
established by model continuity.*!°* As a consequence, the FPM 
method is asymptotically correct for the closed model under these 
conditions (see Theorem 12). In Section VIII we also make stochastic 
comparisons between the stochastic processes in the open and closed 
models, exploiting couplings or almost-surely ordered constructions as 
in Refs. 33 and 34. The stochastic comparisons are interesting in their 
own right, but they also play a role in establishing the convergence. 
We show that a first upper bound for the FPM method is also an 
upper bound for the closed model in terms of both transient and 
equilibrium throughputs and queue lengths (see Corollaries to Theo- 
rems 10 and 11). 

In Section IX we discuss similar approximations for closed networks
with a bottleneck node which is not an IS node. We propose a different
approximation for closed networks with a bottleneck node. We delete 
the bottleneck node from the closed network, but we do not use the 
FPM method. Instead, the proposed approximation is to simply use 
the open model obtained by deleting the bottleneck node and replacing 
its departure process by an external arrival process generated by the 
service times at the bottleneck node. The difference between the total 
population and the expected population in the open subnetwork is the 
suggested approximation for the expected population at the bottleneck 
node. This approximation method is also asymptotically correct as the 
population grows. The vector-valued queue length process in the 
subnetwork of the closed network without the bottleneck node con- 
verges in distribution to the corresponding process in the open network 
as the population grows. This phenomenon is of course quite well 
known,*** but some of the supporting theory here seems to be new. 

In Section X we discuss methods for approximately solving non- 
Markov closed models. We indicate how existing procedures for non- 
Markov open networks such as the Queueing Network Analyzer (QNA) 
can be modified for this purpose. In particular, we describe in detail 
the changes in Ref. 7 to implement the FPM method. 

In Section XI we provide some additional motivation for considering 
special algorithms to analyze non-Markov closed networks. It is some- 
times claimed that Markov models with exponential service-time 
distributions adequately describe throughputs for single-server FCFS 
nodes with nonexponential service-time distributions with the same 
mean, but we show that this is not always the case. We use tight lower 
bounds on the throughput in closed models with FCFS single-server 
nodes and general service-time distributions identified by Arthurs and 
Stuck." For highly variable distributions, the actual throughput can 
be much less than predicted by the Markov model. In fact, the Markov 
model can be arbitrarily bad. The true throughput can be arbitrarily 
close to zero, while the Markov model throughput is arbitrarily close 
to one. 

In Section XII we make additional numerical comparisons that help 
put the different models and approximation procedures in perspective. 
We draw some conclusions in Section XIII. 

This paper contains diverse material, ranging from heuristic algo- 
rithms and examples to theorems and proofs. These are intended to 
complement each other, but the primary algorithm sections (Sections 
VI and X) and mathematics sections (Sections IV, V, and VIII) can 
be read independently. 


1.6 Other bounds and approximations 
In this paper we introduce several approximation procedures and 
establish several bounds for networks of queues. Of course, many other
approximations and bounds have already been developed by others. In 
addition to the previously mentioned balanced network bounds and 
others in Refs. 9 through 11, there are useful bounds in Refs. 37 
through 39. There is great potential for combining them in new ways. 
We focus on the basic Markov network models and natural non- 
Markov extensions obtained by allowing nonexponential service-time 
distributions and non-Poisson arrival processes. However, the results 
also have relevance for more complicated models and other approxi- 
mation procedures. 

For example, open-model representations such as the FPM method 
can be applied in conjunction with aggregation-decomposition approx- 
imation methods for closed networks, as suggested by Zahorjan.”” The 
basic approach is to replace a subnetwork of a closed network by a 
single “composite” node with a state-dependent service rate (see pp. 
165 through 172 of Ref. 6). For the product-form models, by Norton’s 
theorem, the aggregation step is exact if, when there are $m_j$ jobs of
class j in the subnetwork, the composite node service rate of class j is 
precisely the throughput rate for class j for the subnetwork in isolation 
(as a closed model with that population vector) (see pp. 100 and 106 
of Ref. 6). The FPM method can be used as an approximation here to 
calculate approximate throughputs for the subnetworks. 

The FPM method is a very natural idea, so no doubt it has been 
considered before. In fact, we have indicated that it appears in the 
asymptotic analysis in Ref. 2. The FPM method is also intimately 
related to another approximation procedure, which is called the Ap- 
proximate Infinite Source (AIS) method and was proposed independ- 
ently by Fredericks.*° The idea in the AIS method is to replace a finite 
source by an infinite source, in particular a Poisson process, so that 
Little’s formula®*’® relating the throughput, expected population, and 
expected sojourn time remains valid. However, since the original 
population with a finite source is fixed, it is easy to see that this 
constraint is equivalent to making the expected equilibrium population 
in the open model coincide with the actual population in the closed 
model (with the finite source). Hence, aside from additional refine- 
ments, the AIS and FPM methods coincide. Fredericks illustrates the 
effectiveness of this approach with other examples, including a two- 
class priority service system with separate finite sources. 


II. POPULATION VARIABILITY IN AN OPEN NETWORK


At first glance, it might seem that the open model with fixed mean 
number of jobs would always differ dramatically from the associated 
closed model with fixed actual number of jobs. It might seem that the 
population variability in the open network would necessarily be much 
greater than in the closed network (for which there is no population
variability at all). Indeed, for networks with few nodes there typically 
is a dramatic difference in the variability, but it turns out that the 
population variability of an open network tends to decrease as the 
number of nodes increases. This suggests that the FPM method should 
work well for large networks. 

In one sense the variability in an open network increases as the 
number of nodes increases. Since the equilibrium numbers of jobs at 
the different nodes in an open model are independent, the variance of 
the equilibrium population is the sum of the variances at the nodes. 
So, roughly speaking, the population variance tends to grow as the 
network grows (assuming that the marginal distributions at the indi- 
vidual nodes do not change). 


2.1 The population squared coefficient of variation 


However, we believe that the squared coefficient of variation (the 
variance divided by the square of the mean) is usually a better measure 
of the relevant variability than the variance. It describes the variability 
relative to the mean. Suppose that in the open network under
consideration there are $n$ single-server nodes with the traffic
intensity at node $i$ being $\rho_i$. Since the equilibrium number,
$N_i^o$, of jobs at node $i$ has a geometric distribution, the mean,
variance, and squared coefficient of variation of $N_i^o$ are


$$E(N_i^o) = \rho_i/(1 - \rho_i), \quad \mathrm{Var}(N_i^o) = \rho_i/(1 - \rho_i)^2, \quad \text{and} \quad c^2(N_i^o) = 1/\rho_i.$$


Obviously, $c^2(N_i^o)$ can be arbitrarily large, so that we cannot expect
the variability to be small with one single-server node.

The associated parameters of the equilibrium total number, $N^o$, of
jobs in the entire network are


$$E(N^o) = E(N_1^o) + \cdots + E(N_n^o),$$
$$\mathrm{Var}(N^o) = \mathrm{Var}(N_1^o) + \cdots + \mathrm{Var}(N_n^o),$$
$$c^2(N^o) = \mathrm{Var}(N^o)/E(N^o)^2. \qquad (4)$$


When the traffic intensities are all equal, i.e., $\rho_i = \rho$ for all
$i$, $c^2(N^o) = 1/(n\rho)$, so that $c^2(N^o)$ tends to decrease rapidly as
the number of nodes increases. There is a law of large numbers effect
when there are many nodes.*! This is also true as $n$ increases when the
traffic intensities are unequal, provided that $E(N_i^o)$ is
asymptotically negligible compared to $\sum_{j=1}^{n} E(N_j^o)$. By the
central limit theorem,*! $N^o$ tends to be approximately normally
distributed with the mean and variance in (4).

It is easy to see that $c^2(N^o)$ can be very large if there are
relatively few nodes all in light traffic. If there is a single
bottleneck node in heavy traffic, then $c^2(N^o) \approx 1$, which may also
be regarded as significantly different from zero. This indicates that
the FPM method might not be desirable with a bottleneck node (see
Section IX). However, if there is no bottleneck node and if there are
several nodes with at least moderate traffic intensity (and any number
of other nodes), then $c^2(N^o)$ will not be large. For example, if there
are six nodes with $\rho_i = 2/3$ for each $i$, then $c^2(N^o) = 0.25$.
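The formulas of this subsection are easy to evaluate numerically. The following minimal sketch assumes independent single-server nodes with geometric equilibrium distributions, as in the text; the function name is hypothetical.

```python
def open_population_stats(rhos):
    """Mean, variance, and squared coefficient of variation of the total
    equilibrium population in an open network of independent single-server
    nodes with traffic intensities rhos (each strictly between 0 and 1)."""
    mean = sum(r / (1.0 - r) for r in rhos)
    var = sum(r / (1.0 - r) ** 2 for r in rhos)
    return mean, var, var / mean ** 2


if __name__ == "__main__":
    # Six nodes, each with traffic intensity 2/3, as in the example above.
    print(open_population_stats([2.0 / 3.0] * 6))
    # One node in light traffic: the relative variability is large.
    print(open_population_stats([0.1]))
```

For the six-node example the squared coefficient of variation evaluates to 0.25, while a single node with traffic intensity 0.1 gives 10.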


2.2 Several servers 


It is also worth noting that the equilibrium distribution at a node 
usually tends to become less variable as the number of servers in- 
creases, so that the single-server case we have just considered tends 
to be the worst case for variability. This is perhaps not true for IS 
nodes, which often are “delay” nodes. If $\lambda_i$ is the arrival rate,
$\mu_i$ is the individual service rate, and $\alpha_i = \lambda_i/\mu_i$ at
node $i$, then since the number of jobs at an IS node has a Poisson
distribution,


$$E(N_i^o) = \alpha_i, \quad \mathrm{Var}(N_i^o) = \alpha_i, \quad \text{and} \quad c^2(N_i^o) = 1/\alpha_i.$$


If $\lambda_i$ and $\mu_i$ are the same as in the single-server case, then
so is $c^2(N_i^o)$. If $\lambda_i/\mu_i$ tends to increase as the number of
servers increases, then $c^2(N_i^o)$ decreases as well. In an M/M/s queue,
it is possible to show that $c^2(N_i^o)$ decreases as $s$ increases and
converges to 0 as $s \to \infty$, provided that we fix the probability of
delay (see Ref. 42). With $\rho_i = \lambda_i/\mu_i s_i$, this is tantamount
to having $(1 - \rho_i)\sqrt{s_i} \to \beta_i$, $0 < \beta_i < 1$, as $s_i$
increases. In other words, if we only increase $s_i$, then $\rho_i$
decreases and $\lambda_i/\mu_i$ remains unchanged, but if we adjust $\rho_i$
as we change $s_i$ to reflect the corresponding congestion, then the
distribution tends to concentrate. In particular, we then have
$E(N_i^o) \to \infty$, $\mathrm{Var}(N_i^o) \to \infty$, and
$c^2(N_i^o) \to 0$ as $s_i \to \infty$.


2.3 Practical implications 


The rather informal analysis in this section indicates that the
population variability measured by $c^2(N^o)$ in an open network will
often be surprisingly small if (1) the network has quite a few nodes,
(2) the network is not in light traffic, and (3) the network is not
dominated by one or two bottleneck nodes. Perhaps the most important
idea is the possibility of using $c^2(N^o)$ to help determine whether
an open model or a closed model is more appropriate. In an application
it seems appropriate to measure the real system and estimate the
population mean and variance. Then estimate $c^2(N^o)$ for the open
model to judge the quality of the fit.


2.4 Reducing the variability of the arrival processes 


As mentioned in Section I, we might try to improve the open-model 
approximation of a closed model by artificially reducing the variability
of the arrival processes in the open model. For example, we might
replace each M/M/1 queue by an $E_k$/M/1 queue, where $E_k$ is an Erlang
distribution with the same mean. Of particular interest is the limiting
case as $k \to \infty$, D/M/1. It is significant that indeed both
$\mathrm{Var}(N_i^o)$ and $c^2(N_i^o)$ decrease as we increase $k$; in fact,
$\mathrm{Var}(N_i^o)$ and $c^2(N_i^o)$ for D/M/1 are the least possible
values among all GI/M/1 queues with the same arrival rate and service
rate. To see this, recall that for a GI/M/1 queue


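$$\mathrm{Var}(N_i^o) = \rho_i(1 - \rho_i + \sigma_i)/(1 - \sigma_i)^2 \quad \text{and} \quad c^2(N_i^o) = (1 - \rho_i + \sigma_i)/\rho_i, \qquad (5)$$

where $\sigma_i$ is the probability of delay, which is the root of the equation

$$\phi_i(\mu_i(1 - \sigma_i)) = \sigma_i \qquad (6)$$

for

$$\phi_i(s) = \int_0^\infty e^{-st}\, dF_i(t) \qquad (7)$$

with $F_i$ the interarrival-time cdf of the GI/M/1 model for node $i$ (see
II.3 of Ref. 48).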
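As an illustration of (5) through (7), the sketch below treats the D/M/1 case, for which the interarrival time is the constant $1/\lambda_i$, so that the transform in (7) is $e^{-s/\lambda_i}$ and (6) becomes $\sigma_i = \exp(-(1 - \sigma_i)/\rho_i)$. The root is found by simple successive substitution starting from zero, which is a choice made here for brevity and not a method taken from the paper; the mean $\rho_i/(1 - \sigma_i)$ is the standard GI/M/1 value. Function names are hypothetical.

```python
import math


def dm1_delay_probability(rho, tol=1e-12, max_iter=1_000_000):
    """Root in (0, 1) of sigma = exp(-(1 - sigma) / rho) for a D/M/1 queue."""
    sigma = 0.0
    for _ in range(max_iter):
        nxt = math.exp(-(1.0 - sigma) / rho)
        if abs(nxt - sigma) < tol:
            return nxt
        sigma = nxt
    raise RuntimeError("successive substitution did not converge")


def gim1_measures(rho, sigma):
    """Mean, variance, and squared coefficient of variation of the number
    of jobs in a GI/M/1 queue with traffic intensity rho and delay
    probability sigma, using (5) and the mean rho / (1 - sigma)."""
    mean = rho / (1.0 - sigma)
    var = rho * (1.0 - rho + sigma) / (1.0 - sigma) ** 2
    scv = (1.0 - rho + sigma) / rho
    return mean, var, scv


if __name__ == "__main__":
    for rho in (0.1, 0.5, 0.9):
        sigma = dm1_delay_probability(rho)
        print(rho, round(sigma, 3), gim1_measures(rho, sigma))
        # For M/M/1 the delay probability equals rho.
        print(rho, "M/M/1", gim1_measures(rho, rho))
```

Setting the delay probability equal to the traffic intensity recovers the M/M/1 formulas of Section 2.1, which gives a quick consistency check on (5).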

The relationship is appropriately expressed in terms of stochastic 
orderings. We say that one random variable $X_1$ is less than or equal
to another $X_2$ in the sense of stochastic order (denoted by
$X_1 \le_{st} X_2$), increasing convex order (denoted by
$X_1 \le_{ic} X_2$), and convex order (denoted by $X_1 \le_c X_2$),
respectively, if $Eg(X_1) \le Eg(X_2)$ for all nondecreasing,
nondecreasing convex, and convex real-valued functions $g$ for which
the expectations are well defined (see Sections 1.3 through 1.5 of
Ref. 24). Since $g(x) = x$ and $g(x) = -x$ are both convex, convex order
implies equal means. With equal means, convex order is equivalent to
increasing convex order. It is significant that
$D \le_c E_{k+1} \le_c E_k \le_c M \le_c H_2$ for random variables with a
common mean. ($H_2$ is a hyperexponential distribution, the mixture of
two exponential distributions.)

Let $W$ be the equilibrium waiting time before beginning service.
Stoyan and Stoyan showed that $W_1 \le_{ic} W_2$ in two GI/G/1 queues with
common service-time distribution when $X_1 \le_c X_2$, where $X_i$ is the
generic interarrival time in system $i$.** For the special case of
GI/M/1, Rolski and Stoyan showed that $W_1 \le_{st} W_2$ under the same
condition.” Since $\sigma_i = P(W_i > 0)$, $\sigma_1 \le \sigma_2$ and, by
(5), $\mathrm{Var}(N_1) \le \mathrm{Var}(N_2)$ and $c^2(N_1) \le c^2(N_2)$.
Since $EX \le_c X$ for any $X$, these quantities are minimized among
GI/M/1 queues by the D/M/1 case.

Table II—A comparison of congestion measures for the M/M/1 and
D/M/1 queues®”

Traffic               M/M/1                            D/M/1
Intensity,                              Delay Probability,
$\rho_i$      $EN_i^o$   $c^2(N_i^o)$   $\sigma_i$             $EN_i^o$   $c^2(N_i^o)$
0.10           0.11      10.00          0.000                   0.10      9.00
0.20           0.25       5.00          0.007                   0.20      4.04
0.30           0.43       3.33          0.041                   0.31      2.47
0.40           0.67       2.50          0.107                   0.45      1.77
0.50           1.00       2.00          0.203                   0.63      1.41
0.60           1.50       1.67          0.324                   0.89      1.21
0.70           2.33       1.43          0.467                   1.31      1.10
0.80           4.00       1.25          0.629                   2.16      1.04
0.90           9.00       1.11          0.807                   4.66      1.01
0.95          19.00       1.05          0.902                   9.69      1.002
0.98          49.00       1.02          0.960                  24.50      1.000


Table II shows how the variability is reduced by comparing the
principal congestion measures for the M/M/1 and D/M/1 queues. The
variability is reduced the most for traffic intensities near 0.5; e.g.,
for $\rho_i = 0.5$ the D/M/1 value of $c^2(N_i^o)$ is 70 percent of the
M/M/1 value. [From heavy traffic theory,*” we know that, as
$\rho_i \to 1$, $c^2(N_i^o)$ approaches 1 for all GI/M/1 queues and, for
D/M/1, $EN_i^o$ approaches one-half the M/M/1 value.] This analysis shows
that the proposed technique for refining the approximations by
artificially reducing the variability parameters of the arrival
processes would indeed reduce the variability of the number of jobs at
each node. It also indicates by how much. Since the means would also
decrease, this device would also increase the throughput with the FPM
method. However, it remains to determine how much to reduce the
variability of arrival processes and whether this will produce a good
general approximation procedure for network models.


III. COMPARING THROUGHPUTS IN BALANCED NETWORKS
3.1 The closed model with single-server nodes 


In this section, we compare the throughput in a closed model with 
the throughput in the associated open model using the FPM method. 
We still consider the Markov Jackson models, but for simplicity we 
restrict attention to single-server nodes. Given a closed model con- 
taining n single-server nodes and a fixed population, K, construct the 
associated open model by removing the arrival process to one node, 
say the first node, and replacing it by an external Poisson arrival 
process with sufficiently low arrival rate to have stability. Let the 
original arrivals to this entry node in the closed model leave the 
system. Then solve the traffic equations to obtain the arrival rates
$\lambda_i$ and traffic intensities $\rho_i$ for each node $i$. Note that
$n - 1$ of the $n$
traffic-rate equations for the original closed model are the same as for
the new open model.® Since there is one degree of freedom in the
traffic-rate equations for the closed model, the arrival rates
$\lambda_i$ calculated for the open model are legitimate relative traffic
rates for the closed model; i.e., the ratio of the arrival rates at any
two nodes thus is identical in the closed and open models.

Let $N_i^o$ and $N^o$ be the equilibrium number of jobs at node $i$ and in
the entire system for the open model, and let $N_i^c$ and $N^c$ be the
corresponding quantities for the closed model. Obviously, $N^c = K$. For
the open network with the given external arrival rate, the expected
equilibrium number of jobs is


$$E(N^o) = \sum_{i=1}^{n} \rho_i/(1 - \rho_i). \qquad (8)$$


It is somewhat remarkable that the equilibrium distribution of the
number of jobs at each node in the original closed model can be found
by considering the associated open model we have constructed, even
though the arrival process we have removed is not a Poisson process.
The equilibrium distribution of the numbers of jobs at each node in
the original closed model can be expressed exactly in terms of the
solution for the open model constructed above;*° it is


$$P(N_i^c = k_i,\ 1 \le i \le n) = \frac{P(N_i^o = k_i,\ 1 \le i \le n)}{P(N^o = K)} = G \prod_{i=1}^{n} \rho_i^{k_i}, \qquad (9)$$


where $G$ is the normalization constant or partition function chosen so
that the probabilities sum to one over the set of $n$-tuples
$(k_1, k_2, \ldots, k_n)$ such that $k_1 + k_2 + \cdots + k_n = K$. [Of
course, this is partly explained by the fact that the ratios of arrival
rates at any two nodes are the same in the open and closed models. The
one remaining degree of freedom, the arbitrary arrival rate in the open
model, cancels in the division in (9).]

The associated throughput in the closed model, say $\theta^c$, is then the
flow through the designated node, i.e., the utilization $u_1^c$ times the
service rate $\mu_1$:


$$\theta^c = u_1^c \mu_1 = \mu_1 P(N_1^c > 0). \qquad (10)$$


As usual, the throughput can be obtained by calculating the normali- 
zation constant recursively over smaller populations and subsets of 
nodes (see Section 5.5 of Ref. 6). 
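One common way to carry out such a recursion for single-server nodes is the convolution scheme usually attributed to Buzen; it is used below as an assumed stand-in for the procedure in Ref. 6, which is not reproduced here. In this sketch `g[k]` is the sum of the products of powers of the relative traffic intensities over all allocations of `k` jobs, so `g[K]` is the reciprocal of the constant $G$ in (9). The function names are hypothetical.

```python
def normalization_constants(rel_rhos, num_jobs):
    """Return g[0..K], where g[k] sums prod(rho_i ** k_i) over all
    nonnegative (k_1, ..., k_n) with k_1 + ... + k_n = k."""
    g = [1.0] + [0.0] * num_jobs
    for rho in rel_rhos:                # add one node at a time
        for k in range(1, num_jobs + 1):
            # g[k] still holds the value without this node;
            # g[k - 1] already includes this node.
            g[k] += rho * g[k - 1]
    return g


def closed_utilizations(rel_rhos, num_jobs):
    """Utilization of each single-server node in the closed model."""
    g = normalization_constants(rel_rhos, num_jobs)
    ratio = g[num_jobs - 1] / g[num_jobs]
    return [rho * ratio for rho in rel_rhos]


if __name__ == "__main__":
    # Five identical nodes and five jobs.
    print(closed_utilizations([1.0] * 5, 5))
    # An unbalanced three-node example with relative intensities 1, 0.8, 0.6.
    print(closed_utilizations([1.0, 0.8, 0.6], 4))
```

The throughput in (10) is then the first utilization multiplied by the service rate of the designated node. Rescaling all relative traffic intensities by a common factor leaves the utilizations unchanged, reflecting the one degree of freedom noted above.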


3.2 The special case of a balanced network 


From (8) through (10), it is clear that the relation between the 
throughput and the total population is much more elementary for 
open models; for open models, we only need to know the expected 
total population, not the detailed distribution at the nodes. We make 
an interesting explicit comparison, by considering the special case in
which the traffic intensities at all the nodes are identical, say
$\rho$. For the open model, (8) reduces to $EN^o = n\rho/(1 - \rho)$, so
that if we set $EN^o = K$, we obtain $\rho = K/(K + n)$. Thus, the
utilization $u_1^o$ and throughput $\theta^o$ in the open model are


$$u_1^o = K/(K + n) \quad \text{and} \quad \theta^o = \lambda_1 = K\mu_1/(K + n) \qquad (11)$$


because all external arrivals but no internal arrivals go to the
designated first node.
On the other hand, for the closed model,


$$1/(G\rho^K) = A_{K,n} = \binom{n + K - 1}{K}, \qquad (12)$$


the number of ways $K$ indistinguishable objects can be placed into $n$
cells (p. 38 of Ref. 41). Similarly,


$$P(N_1^c = 0) = A_{K,n-1}/A_{K,n} = (n - 1)/(n + K - 1), \qquad (13)$$

so that the utilization and throughput in the closed network are

$$u_1^c = K/(K + n - 1) \quad \text{and} \quad \theta^c = K\mu_1/(n + K - 1). \qquad (14)$$


From (11) and (14), we see that the two throughputs are very similar.
Moreover, $\theta^c > \theta^o$, so that $\theta^o$ is a conservative
estimate of $\theta^c$. (This is always true; see Section IV.)

It is significant that a good approximation for the throughput
$\theta^c$ in the closed model immediately provides a good approximation
for the utilizations of all the nodes. As we have indicated above, the
closed and open models are linked together in a very important way: The
ratio of any two utilizations is always identical in both models; i.e.,


$$u_i^c/u_j^c = u_i^o/u_j^o \qquad (15)$$


for all $i$ and $j$.

The difference between the two utilizations, say $\Delta$, which for the
balanced model is the difference between the throughputs in (11) and
(14) normalized by dividing by the service rate $\mu_1$, is


$$\Delta = u_1^c - u_1^o = (\theta^c - \theta^o)/\mu_1 = K/[(n + K)(n + K - 1)]. \qquad (16)$$


Note that $\theta^c = \mu_1$, $\theta^o = \mu_1/2$, and $\Delta = 1/2$ when
$K = n = 1$. We conjecture that $\Delta \le 1/2$, in general. The
difference $\Delta$ is small if either $n$ or $K$ (especially $n$) is large
but not if $n$ and $K$ are both small. Representative values of $n$, $K$,
$\theta^c$, $\theta^o$, and $\Delta$ are given in Table III. In Table III
the service rate at the entry node is $\mu_1 = 1$. We also describe the
population variability in the open model using (4) and (11), from
which $c^2(N^o) = (K + n)/Kn$. The difference between the two models
as described by both $\Delta$ and $c^2(N^o)$ decreases as $n$ and $K$
increase. Table III quantifies the differences.

Table III—A comparison of throughputs for the network of single-
server nodes with common relative traffic intensities
considered in Section III

                  Network Throughput     Difference in   Population
Nodes   Jobs      Closed      Open       Throughputs     Variability
$n$     $K$       $\theta^c$  $\theta^o$ $\Delta$ (16)   $c^2(N^o)$
 2       2        0.67        0.50       0.17            1.00
 2       5        0.83        0.71       0.12            0.70
 2      20        0.95        0.91       0.04            0.55
 5       2        0.33        0.29       0.04            0.70
 5       5        0.56        0.50       0.06            0.40
 5      20        0.83        0.80       0.03            0.25
20       2        0.10        0.09       0.01            0.55
20       5        0.21        0.20       0.01            0.25
20      20        0.51        0.50       0.01            0.10

Formulas (11) through (16) indicate that the FPM approximation 
for the throughput will be good if either the population, K, or the 
number of nodes, n, is large, but with single-server nodes it seems 
much better to have n large. The quality of the approximate queue- 
length distributions computed by the FPM method often deteriorates 
when there are nodes with high utilizations and few servers. Example 
1 in Section I is ideal for the FPM method; both K and n are large 
(K = n = 20), but the utilizations are not. The finite-waiting-room 
refinement in Section 1.3 is useful for the small models. 
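The balanced-network formulas (11), (14), and (16), together with the expression for the population variability, are simple enough to tabulate directly. The short sketch below does so; the function name is hypothetical and the service rate at the entry node is assumed to be one by default, as in Table III.

```python
def balanced_comparison(n, K, mu1=1.0):
    """Closed and open (FPM) throughputs for a balanced network of n
    single-server nodes with K jobs, the utilization difference, and the
    squared coefficient of variation of the open-model population."""
    theta_closed = K * mu1 / (n + K - 1)       # (14)
    theta_open = K * mu1 / (n + K)             # (11)
    delta = K / ((n + K) * (n + K - 1))        # (16)
    scv = (K + n) / (K * n)                    # from (4) and (11)
    return theta_closed, theta_open, delta, scv


if __name__ == "__main__":
    for n in (2, 5, 20):
        for K in (2, 5, 20):
            print(n, K, ["%.3f" % x for x in balanced_comparison(n, K)])
```

Printing three decimals instead of two makes it easier to see how quickly the difference shrinks as the number of nodes grows.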


3.3 Simple approximations for unbalanced networks 


We can use the results for balanced networks to obtain simple 
approximations for unbalanced networks. A simple rough approximation,
say $\theta^c_{\mathrm{approx}}$, for the throughput in a closed network
with $K$ jobs and $n$ single-server nodes with unequal (but not too
different) utilizations based on (11) and (14) is


$$\theta^c_{\mathrm{approx}} = \theta^o (n + K)/(n + K - 1), \qquad (17)$$


where $\theta^o$ is obtained from the associated open model using (8), e.g., by
simple search. We would not expect (17) to be good if there is a severe 
bottleneck node; we would be in serious trouble if we had six nodes, 
five having relative utilization 1 and the other having relative utiliza- 
tion 3. We also would not count nodes in relatively light traffic; if we 
had nine nodes, three with relative utilizations 1 and 6 with relative 
utilization 3, then it would be better to use n = 6 in (17). 
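A minimal sketch of the "simple search" and of the correction (17) follows. It assumes the input is the list of relative traffic intensities per unit of throughput at the designated node (so the traffic intensity at node $i$ is the throughput times the $i$th entry), and it uses bisection as one possible search; both the input convention and the function names are assumptions made for this illustration.

```python
def fpm_open_throughput(rel_rhos, num_jobs, rel_tol=1e-12):
    """Find the throughput t with sum_i t*r_i / (1 - t*r_i) = num_jobs,
    i.e., the open-model throughput matching the population in (8)."""
    lo, hi = 0.0, 1.0 / max(rel_rhos)   # population grows without bound at hi
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        population = sum(mid * r / (1.0 - mid * r) for r in rel_rhos)
        if population < num_jobs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def closed_throughput_approx(theta_open, n, num_jobs):
    """Rough closed-model throughput from the open-model value, as in (17)."""
    return theta_open * (n + num_jobs) / (n + num_jobs - 1)


if __name__ == "__main__":
    rel = [1.0, 0.8, 0.6]               # mildly unbalanced example
    t_open = fpm_open_throughput(rel, 4)
    print(t_open, closed_throughput_approx(t_open, len(rel), 4))
```

Following the remarks above, the node count passed to the correction could be reduced to exclude nodes in relatively light traffic.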

We can also directly approximate $\theta^o$ by replacing the traffic
intensities at each node with the average traffic intensity over all
nodes. This yields $\theta^o_{\mathrm{approx}} = K\mu_1/(K + n)$ as in (11).
Due to the convexity of $EN_i^o$,
$\theta^o_{\mathrm{approx}} \ge \theta^o$, which is a modification in the
correct direction if we wish to approximate $\theta^c$.

Finally, we could combine these two approximations to obtain (14)
for unbalanced closed networks, but it appears that this would tend to
overestimate $\theta^c$. In fact, (14) has already been shown to be an
upper bound for the closed model.®"!! The simple approximation (17)
worked very nicely in Example 1 in Section 1.2.


3.4 Reducing the variability of the arrival processes 


As in Section 2.4, we can consider approximations for the closed 
model obtained by reducing the variability in the open model. Since 
$EN_i^o = \rho_i/(1 - \sigma_i)$ in the GI/M/1 model, $EN_i^o$ also
decreases as $\sigma_i$ decreases, so reducing the variability of the
arrival process at each node increases the throughput $\theta^o$ in the
open model. (Typical values of $EN_i^o$ for the D/M/1 queue are given in
Table II.) If there are $n$ identical GI/M/1 nodes, then instead of (11)
we have


$$\theta^o = K\mu_1/(n + xK), \qquad (18)$$


where $x = \sigma/\rho$. Since $\sigma$ depends on $\rho$ via (6), (18) is
harder to solve. Moreover, a direct application of the D/M/1 model need
not yield good results because (18) can be much greater than (11). For
example, if $K = 10$, $n = 16$, and $\mu_1 = 1$, then $\theta^c = 0.40$,
while $\theta^o = 0.385$ and 0.50 via (11) and (18), respectively. It
remains to determine how to exploit
this approach. 
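
The numbers in this example can be checked with a short sketch. It assumes that (6) is the standard GI/M/1 root equation, which for deterministic interarrival times reads $\sigma = \exp(-(1 - \sigma)/\rho)$; the function names and the bisection search are illustrative choices, not part of the original method.

```python
import math

def dm1_sigma(rho, tol=1e-13, max_iter=10000):
    """Smallest root in (0, 1) of sigma = exp(-(1 - sigma) / rho), for rho < 1."""
    sigma = 0.0
    for _ in range(max_iter):
        nxt = math.exp(-(1.0 - sigma) / rho)
        if abs(nxt - sigma) < tol:
            return nxt
        sigma = nxt
    return sigma

def fpm_throughput_dm1(n, K, mu=1.0):
    """Bisection on rho so that n * rho / (1 - sigma(rho)) equals K."""
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(100):
        rho = 0.5 * (lo + hi)
        if n * rho / (1.0 - dm1_sigma(rho)) < K:
            lo = rho
        else:
            hi = rho
    return 0.5 * (lo + hi) * mu

n, K = 16, 10
theta_open_mm1 = K / (K + n)                 # (11), Poisson arrivals
theta_closed = K / (K + n - 1)               # (14), closed model
theta_open_dm1 = fpm_throughput_dm1(n, K)    # (18), deterministic arrivals
print(round(theta_open_mm1, 3), round(theta_closed, 3), round(theta_open_dm1, 3))
```

Under these assumptions the three values should land close to the 0.385, 0.40, and 0.50 quoted in the text, with the D/M/1 value overshooting the closed-model throughput.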


3.5 Several servers 


It is also interesting to consider networks of n identical multiserver 
nodes (back with the Markov models). When there are s servers with
$1 < s < \infty$, the formulas are rather complicated, but the situation
simplifies greatly for $s = \infty$. Then $EN_i^c = EN_i^o = K/n$ and $\theta^c = \theta^o =
K\mu_1/n$, so that there is no difference at all. We conjecture that
$(\theta^c - \theta^o)/\theta^c$ decreases in s, which would mean that the single-server case we
have examined gives the worst approximation.

For the open model in which each node has s servers and a common 
traffic intensity, $\rho = \lambda/(s\mu_1)$, $\theta^o$ can be approximated by solving


$$n[\rho s + \delta\rho/(1 - \rho)] = K, \tag{19}$$


where $\delta$ is the probability of delay at node 1 (which also depends on
$\lambda$) (see Ref. 42). A possible procedure is to approximate $\delta$ first and
then solve the resulting quadratic equation for $\lambda$. One could then
iterate, recalculating $\delta$, etc.
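
The iteration just described can be sketched as follows. This is only one way to carry it out: the delay probability is taken to be the Erlang C formula for an M/M/s node, the starting guess $\delta = 1$ is an assumption, and the function names are hypothetical. With $\delta$ held fixed, (19) reduces to the quadratic $s\rho^2 - (s + \delta + K/n)\rho + K/n = 0$, whose smaller root lies in (0, 1).

```python
import math

def erlang_c(s, a):
    """Delay probability in an M/M/s queue with offered load a = lambda/mu, a < s."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)          # Erlang B recursion
    rho = a / s
    return b / (1.0 - rho * (1.0 - b))   # convert Erlang B to Erlang C

def solve_eq19(n, s, K, mu=1.0, iters=200):
    c = K / n                            # required mean number of jobs per node
    delta = 1.0                          # starting guess (an assumption)
    rho = 0.0
    for _ in range(iters):
        # With delta frozen, (19) becomes s*rho^2 - (s + delta + c)*rho + c = 0.
        t = s + delta + c
        rho = (t - math.sqrt(t * t - 4.0 * s * c)) / (2.0 * s)
        delta = erlang_c(s, s * rho)     # recalculate the delay probability
    return s * rho * mu, rho, delta

lam, rho, delta = solve_eq19(n=4, s=2, K=8)
residual = 4 * (rho * 2 + delta * rho / (1.0 - rho)) - 8
print(lam, rho, delta, residual)
```

For two servers the mean number in an M/M/2 node is $2\rho/(1 - \rho^2)$, so with $K/n = 2$ the iteration should settle at $\rho = (\sqrt{5} - 1)/2$, which gives a convenient check on the sketch.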


IV. SUPPORTING THEORY FOR COMPARING THE MODELS

4.1 A lower bound for the closed model 


For the special example in Section 3.2, we saw from (11) and (14)
that $\theta^c > \theta^o$. In general, we should expect the throughput to be greater
in the closed model because it is intuitively obvious that $N_i^c$ is less
variable than $N_i^o$. Given the same mean, $N_i^o$ is evidently more likely to
assume both very large values and very small values, so that we should
have $P(N_i^o = 0) > P(N_i^c = 0)$. Of course, we need not actually have
$EN_i^o = EN_i^c$ when $EN^o = N^c$, but this is the idea.

In this section we justify this reasoning. We assume that the open 
model is constructed from the closed model as described in Section 
3.1. We consider the Markov Jackson network with one job class and 
multiserver nodes as specified in Section I, but it is significant that 
the throughput comparisons extend to Markov networks with multiple 
job classes and more general state-dependent service rates at the nodes. 
Some of these comparisons also apply to the finite-waiting-room 
approximation introduced in Section 1.3. To avoid complicated nota- 
tion, we only discuss these extensions in remarks after Theorem 4. 

Theorem 1: If $EN^o \le N^c$, then $\theta^o \le \theta^c$ and $u_i^o \le u_i^c$ for all i.

For the special case in which all nodes are either single-server or IS 
nodes, Zahorjan” has given a nice proof of Theorem 1. We give a 
different argument that allows us to treat more general nodes, e.g., 
multiserver nodes, and obtain some interesting additional results along 
the way. To establish Theorem 1, we use notions of concave ordering, 
which are closely related to the convex orderings introduced in Section 
II (see Section 1.4 of Ref. 24). One random variable $X_1$ is less than or
equal to another $X_2$ in concave (increasing concave) ordering, denoted
by $X_1 \le_{cv} X_2$ ($X_1 \le_{icv} X_2$), if $Eg(X_1) \le Eg(X_2)$ for all concave (increasing
concave) real-valued functions g for which the expectations are defined.
The connection to the convex orderings is simple: $X_1 \le_{cv} X_2$ if
and only if $X_1 \ge_c X_2$; $X_1 \le_{icv} X_2$ if and only if $-X_1 \ge_{ic} -X_2$. The
following basic characterization for random variables with values in
the nonnegative integers is useful: $X_1 \le_{icv} X_2$ if and only if


$$\sum_{k=0}^{n} P(X_1 \le k) \ge \sum_{k=0}^{n} P(X_2 \le k) \tag{20}$$

for all n (see Sections 1.3 through 1.5 of Ref. 24).

As a basis for Theorem 1, we establish the following result. 

Theorem 2: If $EN_i^o \le EN_i^c$ for any node i, then $N_i^o \le_{icv} N_i^c$.

In fact, Theorem 2 directly implies Theorem 1 given (15). Let $u_i^c$
and $u_i^o$ be the utilizations of node i in the closed and open models,
respectively. They are both defined as the expected number of busy
servers; e.g., for the open model,

$$u_i^o = E(\min\{N_i^o, s_i\}) = \rho_i s_i = \lambda_i/\mu_i, \tag{21}$$


where, of course, $\lambda_i$ is the net arrival rate determined by the traffic
rate equations plus the external arrival rate, which is, in turn, determined
by the FPM requirement that $EN^o = N^c$. Formula (21) is also
valid for the closed model.

To prove Theorem 1, we use the following consequence of Theorem 2.

Corollary to Theorem 2: If $EN_i^o \le EN_i^c$, then $u_i^o \le u_i^c$.

Proof: Apply Theorem 2, observing that the function in (21) is increasing
and concave. □

Proof of Theorem 1: If $EN^o \le N^c$, then $EN_i^o \le EN_i^c$ for some i because
$\sum_{i=1}^{n} EN_i^c = EN^c = N^c$. For one such i, $u_i^o \le u_i^c$ by Theorem 2 and its
corollary. By (15), $u_i^o \le u_i^c$ for all i. Since $\theta^o = u_i^o\mu_i$ and $\theta^c = u_i^c\mu_i$,
$\theta^c \ge \theta^o$ too. □
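
Theorem 1 can be illustrated numerically for single-server nodes. The sketch below is not the argument used in the paper: the closed throughput is computed with the standard convolution algorithm for product-form networks (ratio of normalizing constants), and the FPM throughput by a bisection search. The list `x` holds relative utilizations (visit ratio divided by service rate), and the population of 30 in the unbalanced case is a hypothetical value chosen only for illustration.

```python
def closed_throughput(x, K):
    """Closed-model throughput G(K-1)/G(K) for single-server nodes, by convolution."""
    g = [1.0] + [0.0] * K
    for xi in x:
        for k in range(1, K + 1):
            g[k] += xi * g[k - 1]
    return g[K - 1] / g[K]

def fpm_throughput(x, K):
    """Open-model throughput theta with sum_i theta*x_i / (1 - theta*x_i) = K."""
    lo, hi = 0.0, 1.0 / max(x)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if sum(mid * xi / (1.0 - mid * xi) for xi in x) < K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

balanced = [1.0] * 16                      # n = 16 identical nodes
unbalanced = [1.0] * 10 + [1.2] * 10       # service-time pattern of Example 1

print(fpm_throughput(balanced, 10), closed_throughput(balanced, 10))
print(fpm_throughput(unbalanced, 30), closed_throughput(unbalanced, 30))
```

In the balanced case the two functions should reproduce (11) and (14), and in both cases the FPM value should not exceed the closed-model value.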

To prove Theorem 2, we use notions of log-concavity (see p. 70 of
Ref. 23). A probability mass function $\{p_k, k \ge 0\}$ is log-concave if


$$p_k^2 \ge p_{k+1}p_{k-1}, \quad k \ge 1. \tag{22}$$


A log-concave distribution is unimodal; moreover, it is strongly uni- 
modal, i.e., the convolution with any unimodal distribution is also 
unimodal. In fact, for discrete distributions log-concavity, strong unimodality,
and the PF$_2$ (Polya frequency function) property are all
equivalent.”* The equilibrium distribution of any birth-and-death proc- 
ess is log-concave if the birth rates are nonincreasing and the death 
rates are nondecreasing (see example 5.7F in Ref. 23). Moreover, log-concavity
is preserved under convolution. Hence, for each i and m the
distributions of $N_i^o$ and $N_1^o + \cdots + N_m^o$ are log-concave. (By example
5.7F in Ref. 23 referred to above, it suffices for the service rate at each 
node to be a nondecreasing function of the number of jobs present.) It 
turns out that this is also true for the more complicated distributions 
in the closed network. 

Theorem 3: Let the service rate at each node be a nondecreasing function
of the number of jobs present. For any m the distribution of
$N_1^c + \cdots + N_m^c$ is log-concave.

Proof: Consider m = 1. Since log-concavity is preserved under convolution,”’
the distribution of $N_2^o + \cdots + N_n^o$ is log-concave. Then note
that


$$\frac{P(N_1^c = k + 1)}{P(N_1^c = k)} = \frac{P(N_1^o = k + 1)\,P(N_2^o + \cdots + N_n^o = K - k - 1)}{P(N_1^o = k)\,P(N_2^o + \cdots + N_n^o = K - k)}, \tag{23}$$

with the right-hand side being the product of two ratios, both decreasing
in k. A similar argument applies to m > 1. □

From (23), we see that in some sense the distribution of $N_1^c$ is more
log-concave than the distribution of $N_1^o$. We now formalize this notion.

Definition 1: One probability mass function $\{p_k^1, k \ge 0\}$ is said to be
log-concave relative to another $\{p_k^2, k \ge 0\}$ if $(p_{k+1}^1 p_k^2)/(p_k^1 p_{k+1}^2)$ is
nonincreasing in k.

From (23) it is obvious that $N_1^c$ is log-concave relative to $N_1^o$. We
now show that this supplies what we need for Theorem 2. The key
property is that relative log-concavity implies that the ratio $p_k^1/p_k^2$ is
unimodal.”


Theorem 4: If the distribution of a random variable $X_2$ is log-concave
relative to the distribution of another random variable $X_1$ and $EX_1 \le
EX_2$, then $X_1 \le_{icv} X_2$.

Proof: Our goal is to verify (20). We first show that $P(X_1 = 0) \ge
P(X_2 = 0)$. If not, then the relative log-concavity implies that there
is a $k_0$ such that $P(X_1 = k) < P(X_2 = k)$ for all $k \le k_0$ and no $k > k_0$.
This would make $X_1$ stochastically larger than $X_2$, implying that
$EX_1 > EX_2$, which contradicts an assumption. Hence, $P(X_1 = 0) \ge
P(X_2 = 0)$. Next let $k_1$ be the first k, if any, such that $P(X_1 \le k_1) \le
P(X_2 \le k_1)$. By the relative log-concavity, we must have $P(X_1 \le k) \le
P(X_2 \le k)$ for all $k \ge k_1$. Since $EX_i = \sum_{k=0}^{\infty} P(X_i > k)$,


$$EX_2 - EX_1 = \sum_{k=0}^{\infty} [P(X_1 \le k) - P(X_2 \le k)] \ge 0$$

and

$$\sum_{k=0}^{n} P(X_1 \le k) \ge \sum_{k=0}^{n} P(X_2 \le k), \quad n \ge 0,$$


which establishes (20). □


Proof of Theorem 2: By (23), the distribution of $N_i^c$ is log-concave
relative to the distribution of $N_i^o$ according to Definition 1. By Theorem
4, $N_i^o \le_{icv} N_i^c$. □

Remarks: 1. In a network made up entirely of infinite-server nodes,
we have $u_i^o = u_i^c$ for all i and $\theta^o = \theta^c$, so that we cannot have strict
inequality in Theorems 1 and 2.

2. Theorems 2 through 4 apply to the FPM/FWR method introduced
in Section 1.3. Let $N_i^f$ be the equilibrium number at node i by this
method. It is easy to see that $N_i^c$ is log-concave relative to $N_i^f$, which
in turn is log-concave relative to $N_i^o$. Hence, if $EN_i^o \le EN_i^f$, then
$N_i^o \le_{icv} N_i^f$; if $EN_i^f \le EN_i^c$, then $N_i^f \le_{icv} N_i^c$. However, this does not yield
a proof of the analog of Theorem 1 because the relationship (15) is
lost. We do obtain the lower bound (3), though.

3. As indicated at the beginning of this section, Theorems 1 through
4 extend to multiple job classes. There are many different ways to
define the class structure, but we shall use only basic properties that
have been established for the Markov models.°*® For the open model,
the vector of jobs at the nodes without identifying the classes has the
same equilibrium distribution as when there is only a single class, and
given any number of jobs at node i in equilibrium, each job is of class
j with some probability $p_j$, independently of the other jobs. In other
words, if $N_i^o$ is the total number of jobs at node i and $N_{ij}^o$ is the number
of class j jobs at node i, then $N_{ij}^o$ is obtained from $N_i^o$ Bernoulli trials
with probability $p_j$:


$$P(N_{ij}^o = k) = \sum_{n=k}^{\infty} P(N_i^o = n) \binom{n}{k} p_j^k (1 - p_j)^{n-k}. \tag{24}$$

The key property is that the distribution of $N_{ij}^o$ in (24) is log-concave
whenever the distribution of $N_i^o$ is log-concave. This result is intuitively
reasonable, but not so easy to prove. The result is established in
Theorem 2 of Ref. 46. Given that $N_{ij}^o$ has a log-concave distribution
and that $N_{i_1 j}^o$ is independent of $N_{i_2 j}^o$ when $i_1 \ne i_2$, it is easy to extend
all of the previous results in this section to multiple job classes. For
example, the extension of Theorem 1 states that the utilizations of
each class at each node are ordered, i.e., $u_{ij}^o \le u_{ij}^c$ for all i and j, if the
expected class populations are ordered, i.e., $\sum_{i=1}^{n} EN_{ij}^o \le \sum_{i=1}^{n} N_{ij}^c =
K_j$ for all j.

4. We have indicated that $p^1$ being log-concave relative to $p^2$ implies
that $p_k^1/p_k^2$ is unimodal in k. We call this relationship Uniform Conditional
Variability Order (UCVO), provided that $p^1$ and $p^2$ are not
stochastically ordered, because all conditional distributions, condition- 
ing on a common subset, are either ordered again by UCVO or are 
ordered by ordinary stochastic order. This property parallels uniform 
conditional stochastic order,*”** and is studied further elsewhere.” 


4.2 Dependence in the closed model 

So far in this section (in Theorem 2 and Remark 4 above), we have 
shown how to express the idea that the distribution of the number of 
jobs at each node is less variable in the closed model than in the open 
model, but we have yet to describe the joint distribution at several 
nodes. Unlike in the open model, where the marginal distributions are 
independent, in the closed model the marginal distributions are de- 
pendent. If there are more jobs in one subset of nodes, then there 
should be fewer jobs at another disjoint subset of nodes. The popula- 
tion constraint obviously should make the populations at different 
nodes negatively correlated. We can make these ideas precise using 
recently developed concepts of negative dependence. 

One concept of negative dependence is the Multivariate Reverse
Rule distribution (MRR$_2$), which was introduced by Karlin and Rinott
(Ref. 25). Let p be a multivariate probability mass function on the n-fold
product of the nonnegative integers. The distribution p is said to be
MRR$_2$ if


$$p(x \vee y)\,p(x \wedge y) \le p(x)\,p(y) \tag{25}$$

for all $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in $\{0, 1, \ldots\}^n$, where
$x \vee y = (\max\{x_1, y_1\}, \ldots, \max\{x_n, y_n\})$ and
$x \wedge y = (\min\{x_1, y_1\}, \ldots, \min\{x_n, y_n\})$.


In contrast, if p satisfies (25) with the inequality reversed, p is said to
be Multivariate Totally Positive (MTP$_2$).°° In both cases it suffices to
check (25) for x and y differing in only two components.

Unlike MTP$_2$ distributions,” the marginal distributions of an MRR$_2$
distribution need not be MRR$_2$. Moreover, even having the marginal
distributions all MRR$_2$ is not strong enough to deduce some of the
desired multivariate inequalities. Karlin and Rinott (Ref. 25) proposed one
way to cope with this difficulty, by introducing a special subclass called
the strongly MRR$_2$ (SMRR$_2$) distributions. An n-dimensional probability
mass function p is SMRR$_2$ if the (n − m)-dimensional function
$\sum p(x_1, \ldots, x_n)\,\phi_1(x_{j_1}) \cdots \phi_m(x_{j_m})$ is MRR$_2$ for all m and all m-tuples
of indices $(j_1, \ldots, j_m)$, where the sum is over all $(x_{j_1}, \ldots, x_{j_m})$ and $\phi_i$
is log-concave (PF$_2$) for each i.

Block, Savits, and Shaked (Ref. 26) introduced a convenient structural
condition (condition N) that implies SMRR$_2$. A random vector
$(X_1, \ldots, X_n)$ satisfies condition N if there is a vector of n + 1 independent
random variables $(Y_0, Y_1, \ldots, Y_n)$ each with a PF$_2$ density
(or mass function) such that $(X_1, \ldots, X_n)$ is distributed the same as
$[(Y_1, \ldots, Y_n) \mid Y_0 + \cdots + Y_n = s]$ for some s. It is easy to see that
condition N applies to the closed models as a special case; just set
$Y_0 = 0$ and $s = K$.

Another concept of negative dependence was proposed by Joag-Dev
and Proschan (Ref. 27). They call random variables $X_1, \ldots, X_n$ and their
joint distribution Negatively Associated (NA) if, for every pair of
disjoint subsets $A_1$ and $A_2$ of the index set $\{1, 2, \ldots, n\}$, the covariance


$$\mathrm{cov}(f(X_i, i \in A_1),\ g(X_j, j \in A_2)) \le 0 \tag{26}$$


for all nondecreasing real-valued functions f and g defined on $R^{k_1}$ and
$R^{k_2}$, where $k_i$ is the cardinality of $A_i$. Their Theorem 2.8 directly implies
that $(N_1^c, \ldots, N_n^c)$ is negatively associated. We collect these properties
in Theorem 5.

Theorem 5: The vector $(N_1^c, \ldots, N_n^c)$ is negatively associated, satisfies
condition N, is SMRR$_2$, and has all marginal distributions MRR$_2$.

Proof: Since $(N_1^c, \ldots, N_n^c)$ is distributed as $(N_1^o, \ldots, N_n^o \mid N_1^o +
\cdots + N_n^o = K)$, condition N in Ref. 26 and the sufficient condition
for the NA property in Theorem 2.8 of Ref. 27 are immediate. As
noted in the Remark at the end of Section IV of Ref. 26, condition N
implies SMRR$_2$, which in turn implies that all marginals are MRR$_2$. □


Remark: It is elementary to directly verify that all marginals are
MRR$_2$.

Many important consequences of Theorem 5 are described in Refs. 
25 through 27. We give some illustrative examples. 

Corollary 1 to Theorem 5: Suppose that $\phi_i$ are all nondecreasing or all
nonincreasing functions on the nonnegative integers. Then, for any k,
$1 \le k < n$,


$$E\{\phi_1(N_1^c)\,\phi_2(N_2^c) \cdots \phi_n(N_n^c)\} \le E\{\phi_1(N_1^c) \cdots \phi_k(N_k^c)\} \times E\{\phi_{k+1}(N_{k+1}^c) \cdots \phi_n(N_n^c)\}.$$


Proof: Apply (1.5) of Ref. 25, noting that the PF$_2$ property is not used
there. □

Corollary 2 to Theorem 5: For any $m \le n$ and any m-tuple $(k_1, \ldots, k_m)$,


$$P(N_i^c \le k_i,\ 1 \le i \le m) \le \prod_{i=1}^{m} P(N_i^c \le k_i)$$


and


$$P(N_i^c \ge k_i,\ 1 \le i \le m) \le \prod_{i=1}^{m} P(N_i^c \ge k_i).$$


Proof: Apply Corollary 1. □
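
The negative dependence in Theorem 5 and Corollary 2 can be observed directly in a small closed network. The sketch below enumerates the product-form distribution for three single-server nodes; the relative utilizations (1.0, 0.8, 0.5) and the population K = 5 are hypothetical values, and brute-force enumeration is used only because the example is tiny.

```python
from itertools import product

def closed_joint(x, K):
    """Product-form pmf of (N_1, ..., N_n) for single-server nodes, given total K."""
    weights = {}
    for state in product(range(K + 1), repeat=len(x)):
        if sum(state) == K:
            w = 1.0
            for xi, k in zip(x, state):
                w *= xi ** k
            weights[state] = w
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}

def covariance(p, i, j):
    ei = sum(s[i] * w for s, w in p.items())
    ej = sum(s[j] * w for s, w in p.items())
    eij = sum(s[i] * s[j] * w for s, w in p.items())
    return eij - ei * ej

def lower_orthant_gap(p, i, j, a, b):
    """P(N_i <= a, N_j <= b) - P(N_i <= a) P(N_j <= b); Corollary 2 says <= 0."""
    joint = sum(w for s, w in p.items() if s[i] <= a and s[j] <= b)
    pi = sum(w for s, w in p.items() if s[i] <= a)
    pj = sum(w for s, w in p.items() if s[j] <= b)
    return joint - pi * pj

p = closed_joint([1.0, 0.8, 0.5], K=5)
print(covariance(p, 0, 1))
print(max(lower_orthant_gap(p, 0, 1, a, b) for a in range(6) for b in range(6)))
```

The covariance should come out negative and the largest gap should be zero up to rounding, in line with (26) and the first inequality of Corollary 2.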


Remark: Theorem 5 and its corollaries describe multivariate dependence
for a single job class. The multiple-job-class closed Markov
network suggests a natural generalization of condition N in Ref. 26,
which we call condition QN. A random vector $X = (X_{ij}: 1 \le i \le n, 1 \le
j \le m)$ in $R^{nm}$ satisfies condition QN if it is appropriately related to
another random vector. The other random vector is $Y = (Y_{ij}: 0 \le i \le
n, 1 \le j \le m)$ in $R^{(n+1)m}$ such that (1) the subvectors $(Y_{0j}: 1 \le j \le m)$,
$(Y_{1j}: 1 \le j \le m)$, ..., $(Y_{nj}: 1 \le j \le m)$ are mutually independent, (2)
the random variables $\sum_{j=1}^{m} Y_{ij}$ have a PF$_2$ density or mass function for
each i, and (3) given $\sum_{j=1}^{m} Y_{ij}$, $(Y_{i1}, \ldots, Y_{im})$ has a multinomial distribution
for each i. We say that X satisfies condition QN if X is
distributed as $(Y_{ij}: 1 \le i \le n, 1 \le j \le m)$ conditional on $\sum_{i=0}^{n} Y_{ij} =
s_j$, $1 \le j \le m$, for some m-tuple $(s_1, \ldots, s_m)$. For m = 1, of course
condition QN reduces to condition N. Condition QN is being investigated;
a discussion of its properties is intended for a future paper.




4.3 Changing the population 


With closely related stochastic comparison concepts, we can also 
describe what happens when we increase the population in a closed 
network. Let $(N_1^c(K), \ldots, N_n^c(K))$ be the equilibrium vector of jobs
at each node as a function of the total population K. Naturally, we
expect it to be increasing in K in some sense. In fact, this is true in a
very strong sense. Following Karlin and Rinott,®° we say that one
multivariate probability mass function $p_1$ is less than or equal to
another $p_2$ in the sense of multivariate Monotone Likelihood Ratio
(MLR), and we write $p_1 \le_{lr} p_2$, if


$$p_1(x)\,p_2(y) \le p_1(x \wedge y)\,p_2(x \vee y) \tag{27}$$


for all $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in $\{0, 1, \ldots\}^n$. The MLR
order is a generalization of MTP$_2$ because $p \le_{lr} p$ if and only if p is
MTP$_2$.” MLR order implies stochastic order for the original distributions
(i.e., $p_1 \le_{st} p_2$) and also for all conditional distributions conditioning
on sublattices.

The probability mass functions of the full vector $[N_1^c(K), \ldots,
N_n^c(K)]$ are not usefully compared by (27) for different K because they
have disjoint support sets. However, we can usefully compare the 
marginal distributions. There is a further complication, however, 
because (27) will obviously fail when x and y are in the support of 
both distributions but x V y is not. However, the ordering (27) does 
hold over every sublattice of the support. 

Theorem 6: Given a closed model with n nodes, $[N_1^c(K), \ldots, N_{n-1}^c(K)]$
is nondecreasing in K in the MLR ordering in the sense that the support
set $\{(k_1, \ldots, k_{n-1}): k_1 + \cdots + k_{n-1} \le K\}$ in $\{0, 1, \ldots\}^{n-1}$ is nondecreasing
in K and (27) holds for K and K + 1 as an equality whenever the sum
of the components of $x \vee y$ is less than or equal to K + 1.

Proof: It suffices to establish (27) for x and y differing by one in just
two indices, say 1 and 2. Let $x = (k_1 + 1, k_2, k_3, \ldots, k_{n-1})$ and $y = (k_1,
k_2 + 1, k_3, \ldots, k_{n-1})$. Let $p_K$ be the probability mass function of
$[N_1^c(K), \ldots, N_{n-1}^c(K)]$. Then (27) holds as an equality provided that
$k_1 + \cdots + k_{n-1} + 2 \le K + 1$ because


$$\frac{p_K(x)\,p_{K+1}(y)}{p_K(x \wedge y)\,p_{K+1}(x \vee y)} = \frac{P\bigl(N_n^o = K - \sum_{j=1}^{n-1} k_j - 1\bigr)\,P\bigl(N_n^o = K + 1 - \sum_{j=1}^{n-1} k_j - 1\bigr)}{P\bigl(N_n^o = K - \sum_{j=1}^{n-1} k_j\bigr)\,P\bigl(N_n^o = K + 1 - \sum_{j=1}^{n-1} k_j - 2\bigr)} = 1. \quad □$$

Corollary 1 to Theorem 6: For each i and K, $N_i^c(K) \le_{lr} N_i^c(K + 1)$.

Corollary 2 to Theorem 6: The utilizations $u_i^c(K)$ are increasing in K.

Proof: Apply Corollary 1 with the increasing function in (21). □

Corollary 3 to Theorem 6: The conditional distribution of $[N_1^c(K), \ldots,
N_{n-1}^c(K)]$ given any sublattice of $\{0, 1, \ldots, m\}^{n-1}$ with maximal element
$(k_1, \ldots, k_{n-1})$ is independent of K for $K \ge \sum_{i=1}^{n-1} k_i$.

Corollary 4 to Theorem 6: $P(N_i^c(K) = m \mid a_j \le N_j^c(K) \le b_j,\ 1 \le j \le
n - 1)$ is independent of K for $K \ge \sum_{j=1}^{n-1} b_j$.

Proof: Apply Corollary 3. □
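
Corollaries 1 and 2 can be checked numerically for single-server nodes. The sketch below computes the marginal distribution of one node from normalizing constants of the remaining nodes and tests whether the likelihood ratio between populations K + 1 and K is nondecreasing over the common support. The relative utilizations are hypothetical, and the helper names are not from the paper.

```python
def normalizing_constants(x, K):
    """g[k] = sum of product-form weights over states of the nodes in x with k jobs."""
    g = [1.0] + [0.0] * K
    for xi in x:
        for k in range(1, K + 1):
            g[k] += xi * g[k - 1]
    return g

def marginal(x, i, K):
    """pmf of the number of jobs at node i in the closed model with population K."""
    rest = normalizing_constants(x[:i] + x[i + 1:], K)
    w = [x[i] ** k * rest[K - k] for k in range(K + 1)]
    total = sum(w)
    return [v / total for v in w]

def lr_nondecreasing(x, i, K):
    """True if P(N_i(K+1) = k) / P(N_i(K) = k) is nondecreasing in k = 0, ..., K."""
    p_lo, p_hi = marginal(x, i, K), marginal(x, i, K + 1)
    ratios = [p_hi[k] / p_lo[k] for k in range(K + 1)]
    return all(ratios[k] <= ratios[k + 1] + 1e-12 for k in range(K))

def utilization(x, i, K):
    """Single-server utilization P(N_i(K) > 0)."""
    return 1.0 - marginal(x, i, K)[0]

x = [1.0, 0.8, 0.5]
print([lr_nondecreasing(x, 0, K) for K in range(1, 9)])
print([round(utilization(x, 0, K), 4) for K in range(1, 9)])
```

Each likelihood-ratio check should return True, and the utilizations should increase with K.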


V. LIMITS FOR GROWING NETWORKS 


In this section we provide some additional theory to show that the 
FPM method tends to perform well as an approximation for closed 
models when the network is large. For particular kinds of growing 
networks we prove that the FPM method is asymptotically correct as 
the number of nodes increases. We assume that the population is nK 
when there are n nodes. As we indicated in Section 1.2, the basic idea 
here is due to Gordon and Newell (see pp. 261 through 265 of Ref. 2), 
but we formulate and prove a limit theorem. 

To be precise, we must specify how the network topology and other 
model parameters change as the closed network grows. If the connec- 
tivity increases as the closed network grows, so that the departure 
processes are split into many components and the arrival processes 
are superpositions of many components, then it is usually possible to 
show that any fixed finite subset of nodes in the closed network 
behaves asymptotically as mutually independent queues (with mu- 
tually independent Poisson arrival processes). This is an even stronger 
form of independence than in the open model because it applies to the 
time-dependent stochastic processes as well as the equilibrium distri- 
bution. A simple example is the n-node network with routing of 
departures from every node to all other nodes with equal probability. 
Growing networks with increasing connectivity can be treated by 
classical limit theorems for superposition and thinning.*)” 

Motivated by Example 1 in Section 1.2, we formulate a limit theorem 
in which the connectivity does not grow with n. We define our sequence 
of closed models as follows. We start with a general open Markov 
product-form network having q nodes, p job classes, and a p-tuple of
independent Poisson processes determining the arrivals (one for each 
class). We then replicate this network n times, letting the departures 
from network k be the arrivals to network k + 1. We let the routing 
probabilities at each subnetwork be identical. Finally, we make it a 
closed model by replacing the external arrival process to network 1 by 
the departure process from network n and stipulating that there are 
$nK_j$ customers of class j, $1 \le j \le p$. A symmetric closed cyclic network
is the special case in which the initial building-block network has one
node. Example 1 in Section 1.2 can be regarded as the special case in
which the initial building-block network is a network with two single-server
nodes in series, one with mean service time 1.0 and the other
with mean service time 1.2. The numerical calculations were for n = 10.

The following result shows that the FPM method is asymptotically 
correct for such cyclically growing networks as $n \to \infty$. In Remark 1
following the proof, we indicate that Theorem 7 also applies to growing
networks with increasing connectivity. Let $N_{ijk}^c(n)$ be the number of
class j jobs at node i of the kth subnetwork in the nth closed model;
let $N_{ijk}^o(\lambda)$ be the number of class j jobs at node i of the kth subnetwork
in the associated open model having independent Poisson arrival
processes with rate vector $\lambda$ at the first subnetwork, with all departures
from the nth subnetwork leaving the system.


Theorem 7: For any $k_0$ and $(qpk_0)$-tuple $(m_{111}, \ldots, m_{qpk_0})$,

$$\lim_{n \to \infty} P(N_{ijk}^c(n) = m_{ijk};\ 1 \le i \le q,\ 1 \le j \le p,\ 1 \le k \le k_0) = \prod_{k=1}^{k_0} \prod_{i=1}^{q} P(N_{ijk}^o(\lambda) = m_{ijk};\ 1 \le j \le p) \equiv Z,$$

where $\lambda = (\lambda_1, \ldots, \lambda_p)$ is the FPM solution for the original q-node
open building-block network.


Proof: Let $M_j = \sum_{k=1}^{k_0} \sum_{i=1}^{q} m_{ijk}$ for $1 \le j \le p$. As in (9),

$$P(N_{ijk}^c(n) = m_{ijk};\ 1 \le i \le q,\ 1 \le j \le p,\ 1 \le k \le k_0) = Z\,\frac{P\bigl(\sum_{k=k_0+1}^{n} \sum_{i=1}^{q} N_{ijk}^o(\lambda) = nK_j - M_j;\ 1 \le j \le p\bigr)}{P\bigl(\sum_{k=1}^{n} \sum_{i=1}^{q} N_{ijk}^o(\lambda) = nK_j;\ 1 \le j \le p\bigr)}$$

for any vector of external arrival rates $\lambda$ for which there is stability.
(Z is defined in the statement of Theorem 7.) For the special case in
which $\lambda$ is the FPM vector, we can apply the local central limit
theorem, pp. 75 through 79 of Spitzer (Ref. 12), to obtain our desired result.
Suppose that $\lambda$ is such that $E \sum_{i=1}^{q} N_{ij1}^o(\lambda) = K_j$ for each j. Since
$\sum_{k=1}^{n} \bigl(\sum_{i=1}^{q} N_{ijk}^o(\lambda) - K_j\bigr)$ is the jth component of the nth partial sum
of i.i.d. random p-tuples where each component has mean 0, there
exists $\alpha$, $0 < \alpha < \infty$, such that


$$\lim_{n \to \infty} n^{p/2}\, P\Bigl(\sum_{k=1}^{n} \sum_{i=1}^{q} N_{ijk}^o(\lambda) = nK_j + a_j:\ 1 \le j \le p\Bigr) = \alpha$$


for any p-tuple $(a_1, \ldots, a_p)$. (It is easy to see that the aperiodicity
requirement in the local central limit theorem is satisfied.) We thus
obtain the desired result by multiplying both the numerator and
denominator by $n^{p/2}$ and letting $n \to \infty$. □
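
For the symmetric closed cyclic network with one job class and single-server nodes (q = 1, p = 1), the convergence in Theorem 7 can be seen with exact arithmetic. The sketch below relies on the fact that all states of a balanced closed network are equally likely, so the marginal distribution at one node is a ratio of binomial coefficients; the FPM open model then has $\rho = K/(K + 1)$ and a geometric marginal. The function names are illustrative.

```python
from math import comb

def closed_marginal(n, K, m):
    """P(N_1 = m) in a balanced closed cyclic network: n >= 2 nodes, population n*K."""
    N = n * K
    return comb(N - m + n - 2, n - 2) / comb(N + n - 1, n - 1)

def open_marginal(K, m):
    """P(N_1 = m) in the FPM open model, where rho / (1 - rho) = K."""
    rho = K / (K + 1.0)
    return (1.0 - rho) * rho ** m

K = 2
for n in (10, 100, 1000):
    print(n, [round(closed_marginal(n, K, m), 5) for m in range(4)])
print("open", [round(open_marginal(K, m), 5) for m in range(4)])
```

As n grows with the population per node held at K, the closed-model probabilities should approach the geometric values of the open model.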

Remarks: 1. Theorem 7 and its proof also apply to other kinds of 
growing networks in which the connectivity does increase. For example,
consider the symmetric n-node network in which each node has
the same external Poisson arrival process with rate $\lambda_j$ and probability
$r_j$ of departures leaving the system for class j, $1 \le j \le p$, independent
of n. Also, let departures staying within the network be routed to all
other nodes with equal probability. Then the arrival rate of class j at
each node is $\lambda_j/r_j$, independent of n. Then the equilibrium
vectors of jobs at any node in the open model are the same for all
nodes and are independent of n, so that Theorem 7 and its proof
apply with q = 1.

A more complex symmetric growing model with increasing connectivity
that can be treated the same way is obtained by replacing each
node in this example by a q-node subnetwork. The departures from
each q-node subnetwork staying within the system would be routed to
each possible q-node subnetwork with equal probability. Each q-node
subnetwork would also have its own external arrival processes. Then
the equilibrium vector of jobs of each class at each node in the open
model is the same for all q-node subnetworks and is independent of n.

2. Gordon and Newell propose a refinement to the FPM approximation
for large networks, (27) in Ref. 2, which is obtained by
approximating the probabilities involving the large partial sums in the
numerator and denominator by the normal density function. This
is justified by the Remark on p. 77 of Ref. 12.


VI. THE FPM METHOD WITH A DECOUPLING INFINITE-SERVER 
NODE 


We now consider the special case of a closed network with an IS 
node. As in Section V, we allow p different job classes. We introduce 
this added complexity here because our algorithm is particularly well 
suited to cope with it. Let each class have its own population and 
routing probabilities. Let there be q + 1 nodes with node q + 1 being
an IS node and assume that it is visited by every class. (It would
suffice to have different IS nodes visited by different classes. The
other nodes visited by any class might include IS nodes too; the
designated IS node has especially low service rates.) Let $\mu_j$ be the
individual service rate of class j at node q + 1. Let $K_j$ be the given
fixed population of class j, $1 \le j \le p$. (We are now in the setting of
Ref. 14, except that we are allowing multiserver nodes.)

We now modify the original closed model by removing the departure
processes for the p classes from the designated IS node and replacing
them by p independent external Poisson arrival processes to the
remaining q nodes. Let $\lambda_j$ be the rate of the Poisson process for class
j and let $\lambda = (\lambda_1, \ldots, \lambda_p)$.

Let $N_{ij}^o(\lambda)$ be the equilibrium (steady-state) number of customers of class
j at node i in the q-node open network without the IS node based on
the Poisson external arrival processes with rate vector $\lambda$. (We have
changed the notation somewhat to emphasize the dependence on $\lambda$.)
We use the designated IS node to determine the appropriate arrival-rate
vector $\lambda$. Since the arrival rates equal the departure rates in the
open network, the departure rate of class j from the q-node open
subnetwork is also $\lambda_j$. Since these departures all leave from the
designated IS node, we use the IS node to enforce consistency. In
particular, we require that the arrival rate be equal to the departure
rate for each class at the IS node, i.e.,


$$\lambda_j = \Bigl(K_j - \sum_{i=1}^{q} EN_{ij}^o(\lambda)\Bigr)\mu_j \tag{28}$$

for each j.

Since the expected equilibrium number of customers in a G/GI/$\infty$
model (with general stationary arrival process) is just the arrival rate
divided by the individual service rate [e.g., see (4.2.3) of Ref. 19], eq.
(28) would be valid in the original closed model if (1) $\lambda$ were
the vector of arrival rates to the IS node and (2) $\sum_{i=1}^{q} EN_{ij}^o(\lambda)$ were
the correct mean number of class j customers in the q-node subnetwork.
However, $\sum_{i=1}^{q} EN_{ij}^o(\lambda)$ is in fact an approximation based on
both $\lambda$ and the Poisson assumption. Even if $\lambda$ were correct, $EN_{ij}^o(\lambda)$
would be an approximation. In Section VIII we show that both
conditions are satisfied asymptotically if we let $K_j \to \infty$, $\mu_j \to 0$, and
$K_j\mu_j \to \lambda_j$ for each j. (This is completely established only for the case
of one job class, but we conjecture that the convergence is valid for
multiple job classes too.) Hence, there is reason to expect that the
procedure will perform well as an approximation for the closed model
under certain conditions. Interestingly, as indicated in Section I, these
conditions are the same as those in Ref. 14.

Equation (28) also coincides with the FPM method. With the FPM
method we approximately solve the original closed model by finding
the external arrival rate in the associated (q + 1)-node open network
that makes the expected equilibrium population precisely $K_j$ for class
j. (We now regard the IS node as part of the open network.) However,
the expected population of class j customers in the IS node is $\lambda_j/\mu_j$, so
that (28) is equivalent to


$$K_j = \sum_{i=1}^{q+1} EN_{ij}^o(\lambda) \tag{29}$$


for each j.

To complete the specification of the FPM procedure here, we must
identify the vector $\lambda$ satisfying (28) for all j. This can usually be done
iteratively. The key is to recognize that the vector $\{EN_{ij}^o(\lambda),\ 1 \le i \le q,\
1 \le j \le p\}$ is a strictly increasing continuous function of $(\lambda_1, \ldots, \lambda_p)$.
We first bound $\lambda_j$ above by

$$U_j^{(1)} = K_j\mu_j, \quad 1 \le j \le p. \tag{30}$$


Then we bound $\lambda_j$ below and above successively by


$$L_j^{(k)} = \Bigl(K_j - \sum_{i=1}^{q} EN_{ij}^o(U_1^{(k)}, \ldots, U_p^{(k)})\Bigr)\mu_j,$$

$$U_j^{(k+1)} = \Bigl(K_j - \sum_{i=1}^{q} EN_{ij}^o(L_1^{(k)}, \ldots, L_p^{(k)})\Bigr)\mu_j \tag{31}$$


for $k \ge 1$, $1 \le j \le p$. It is easy to see that

$$L_j^{(k)} \le L_j^{(k+1)} \le \lambda_j \le U_j^{(k+1)} \le U_j^{(k)}, \tag{32}$$


$L_j^{(k)} \to \lambda_j$, and $U_j^{(k)} \to \lambda_j$ for $\lambda_j$ satisfying (28). [Use the fact that
$EN_{ij}^o(\lambda)$ is a continuous strictly increasing function of $\lambda$.]

To properly initialize the procedure, we must of course have (30) be
feasible arrival rates; i.e., we need stability:


$$EN_{ij}^o(U_1^{(1)}, \ldots, U_p^{(1)}) < \infty \tag{33}$$


for all i and j. Moreover, we need


$$\sum_{i=1}^{q} EN_{ij}^o(U_1^{(1)}, \ldots, U_p^{(1)}) < K_j \tag{34}$$

for each j to have (31) be feasible rates. Conditions (33) and (34)
should hold if indeed $K_j$ is large, $\mu_j$ is small, and $K_j\mu_j$ is not too large.
If conditions (33) and (34) were violated, we could search for initial
conditions satisfying the appropriate monotonicity.
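
The bounding iteration (30) through (32) can be sketched for the simplest setting: one job class and a single-server node (q = 1) in front of the IS node, so that the mean number at the node is the M/M/1 value $\lambda/(\mu_1 - \lambda)$. The parameter values below are hypothetical, chosen so that (33) and (34) hold, and the closed-form root used for comparison is simply the relevant solution of the quadratic obtained from (28) in this special case.

```python
import math

def mean_jobs_cpu(lam, mu1):
    """Mean number in an M/M/1 node with arrival rate lam < mu1."""
    return lam / (mu1 - lam)

def fpm_bounds(K, mu1, mu2, steps=20):
    """Successive lower and upper bounds (L^(k), U^(k)) on the FPM arrival rate."""
    upper = K * mu2                                     # (30)
    if upper >= mu1 or mean_jobs_cpu(upper, mu1) >= K:
        raise ValueError("conditions (33)-(34) fail for this starting point")
    history = []
    for _ in range(steps):
        lower = (K - mean_jobs_cpu(upper, mu1)) * mu2   # L^(k) from U^(k)
        history.append((lower, upper))
        upper = (K - mean_jobs_cpu(lower, mu1)) * mu2   # U^(k+1) from L^(k)
    return history

def fpm_exact(K, mu1, mu2):
    """Root in (0, mu1) of lam = (K - lam / (mu1 - lam)) * mu2."""
    x = mu2 / mu1
    b = 1.0 + x * (1.0 + K)
    return 0.5 * mu1 * (b - math.sqrt(b * b - 4.0 * K * x))

history = fpm_bounds(K=50, mu1=1.0, mu2=0.01)
for lower, upper in history[:4]:
    print(lower, upper)
print(fpm_exact(50, 1.0, 0.01))
```

For these parameters the bounds should tighten within a few steps; the sketch does not address cases where the starting point violates (33) or (34).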

In fact, such elaborate analysis as we have just described is often
unnecessary. If, indeed, $K_j$ is large and $\mu_j$ is small for each j, with no
node in the q-node network in heavy traffic, then it often suffices to
use the simple formula (30) as the approximation for the external
arrival rate $\lambda_j$. As indicated in (32), the approximation (30) yields
upper bounds for $\lambda_j$ and $EN_{ij}^o(\lambda)$ for all i and j. Alternatively, the first
lower bound in (32) is often a good approximation; see Section VII.

It is intuitively obvious that the first upper bound (30) for $\lambda$ in (28)
is also an upper bound for the vector of throughputs in the original
closed model, and we prove this in Section VIII. It also seems plausible
that the equilibrium queue length vector $\{N_{ij}^o(\lambda)\}$ in the open model
with $\lambda$ in (30) would be a stochastic upper bound for the corresponding
random vector in the closed model, and we also prove this for the case
of a single job class. (The more general case of multiple job classes
remains a conjecture.) Theorem 1 in Section IV establishes for a single
job class that the FPM throughput in (28) is a lower bound for the
throughput in the original closed model. This extends to multiple job
classes by the remark following Theorem 4. Hence, for each job class
j, we have


$$L_j^{(1)} \le L_j^{(k)} \le \lambda_j = \theta_j^o \le \theta_j^c \le U_j^{(1)}. \tag{35}$$


VII. EXAMPLES WITH A DECOUPLING INFINITE-SERVER NODE


In this section, we illustrate the FPM method in Section VI by
returning to Example 3 in Section 1.4, which is the central processor
model treated by McKenna, Mitra, and Ramakrishnan. This is a
closed cyclic product-form network with two nodes. The first node is
the CPU, where service is provided according to the processor-sharing
discipline. Equivalently for our purposes, because of insensitivity
properties, the service discipline can be FCFS with an exponential
service-time distribution. The second node is a think node, which is
an IS node, representing independent delays at terminals before a job
is next sent to the CPU. Again, because of insensitivity properties,
only the mean of the service-time distribution at the IS node matters.
We shall first consider the case of one job class, which is test problem
1 described in Table I of Ref. 13. Then we consider two job classes,
and finally we consider the special case of a population of size one to
show that the FPM method can perform poorly when the approximating
conditions do not nearly hold.


7.1 One job class 


The specified model in Ref. 13 is closed with a fixed population size
(also referred to as degree of multiprogramming). We shall consider
the associated open model obtained by cutting the arrivals to node 1
and inserting an external Poisson arrival process with rate $\lambda$. This
open model has a very simple solution. To express it, let $\mu_1^{-1}$ be the
mean processing time for each job at the CPU and $\mu_2^{-1}$ the mean think
time (individual service time at node 2). Let $\rho_1 = \lambda/\mu_1$ and $\alpha_2 = \lambda/\mu_2$
and assume that $\rho_1 < 1$ to guarantee stability. The equilibrium distribution
in the open model has independent marginal distributions with
the marginal being geometric at node 1 and Poisson at node 2:


$$P(N_1^o = k_1, N_2^o = k_2) = (1 - \rho_1)\rho_1^{k_1} e^{-\alpha_2}\alpha_2^{k_2}/k_2!. \qquad (36)$$

With the FPM method, we set the expected equilibrium total population
equal to K; i.e.,

$$EN^o = \frac{\rho_1}{1 - \rho_1} + \alpha_2 = \frac{\lambda}{\mu_1 - \lambda} + \frac{\lambda}{\mu_2} = K. \qquad (37)$$

Hence, we obtain the following formula for the external arrival rate $\lambda$:

$$2\lambda/\mu_1 = 1 + x(1 + K) - \sqrt{[1 + (1 + K)x]^2 - 4Kx}, \qquad (38)$$

where $x = \mu_2/\mu_1$. By using a Taylor series expansion in powers of x,
we see that

$$\lambda/\mu_1 = Kx - Kx^2 - K(K - 1)x^3 + o(x^3), \qquad (39)$$

so that $\lambda \approx K\mu_2$ when $\mu_2$ and $\mu_2 K$ are sufficiently small compared to
$\mu_1$.

In contrast, for the closed model we combine (9) and (36). This is
conceptually simple, but the calculation can be complicated for large
population sizes. McKenna, Mitra, and Ramakrishnan used this
example to illustrate the advantage of PANACEA over previous convolution
algorithms for the closed model, such as are contained in the
software package CADS. For large population sizes, CADS was
unable to obtain a solution, while PANACEA obtained a solution
easily. Moreover, for all population sizes, the throughputs calculated
by the two methods agree closely.

It is significant that comparable results can be obtained for this
example by the FPM method by hand. We do not even need to use
(38); we can simply use (30) to obtain $\lambda = K\mu_2$. A comparison of the
throughput calculations appears in Table I. In this example we have
small $\mu_2$ ($\mu_2 = 1/240$ and $\mu_1 = 1$) and large K (K = 10, 50, 100, and
200). Since the FPM method throughput provides a lower bound on
the closed network throughput, the FPM answer is essentially exact
for K < 100. As in Refs. 13 and 14, the FPM procedure works best
here if $\mu_2$ is small, K is large, and $K\mu_2$ is not too close to $\mu_1$. Under
heavy loads, the open-network M/M/1 formula at node 1 keeps the
throughput down with the FPM method.

The last two columns of Table I contain the first two and first three 
terms of the Taylor series expansion in (39); the first term of course 
corresponds to the first upper bound in (30), which appears earlier in 
the table. Evidently the algorithm in Section VI converges faster than 
the Taylor series for larger values of K. 
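
The one-class calculations above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the function names are ours, the parameter values $\mu_1 = 1$ and $\mu_2 = 1/240$ are those quoted in the text, and the exact closed-model throughput is computed directly from the product-form weights of this two-node network (a single-server node and an IS node) rather than by the algorithms of Ref. 13.

```python
import math

def fpm_rate(K, mu1, mu2):
    """External arrival rate solving the FPM condition (37), via formula (38)."""
    x = mu2 / mu1
    b = 1.0 + (1.0 + K) * x
    return 0.5 * mu1 * (b - math.sqrt(b * b - 4.0 * K * x))

def taylor_rate(K, mu1, mu2, terms=3):
    """First `terms` (1 to 3) terms of the expansion (39)."""
    x = mu2 / mu1
    coeffs = [K, -K, -K * (K - 1)]
    return mu1 * sum(c * x ** (i + 1) for i, c in enumerate(coeffs[:terms]))

def closed_throughput(K, mu1, mu2):
    """Exact throughput of the closed two-node model with K jobs.

    The unnormalized weight of k jobs at node 1 satisfies
    w(k + 1) / w(k) = (mu2 / mu1) * (K - k), and the throughput is
    mu1 * P(node 1 is busy).
    """
    x = mu2 / mu1
    weight, total = 1.0, 1.0
    for k in range(K):
        weight *= x * (K - k)
        total += weight
    return mu1 * (1.0 - 1.0 / total)

mu1, mu2 = 1.0, 1.0 / 240.0
for K in (10, 50, 100, 200):
    print(K, K * mu2, fpm_rate(K, mu1, mu2),
          taylor_rate(K, mu1, mu2), closed_throughput(K, mu1, mu2))
```

Each printed row lists the population, the first upper bound (30), the FPM rate from (38), the three-term Taylor approximation (39), and the exact closed-model throughput, so the ordering of lower bound, exact value, and upper bound can be inspected directly.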


7.2 Two job classes 


We now give additional details for test problem 2 in Ref. 13, which
differs from test problem 1 only by having two job classes. Node 1
(the CPU) is again a processor-sharing node and node 2 is the IS node.
The mean service times for the two classes are 1 and 1.5 at the CPU
and 450 and 150 at the IS node, respectively. Let $\mu_{ij}$ be the service rate
of class j at node i and let $K_j$ be the population of class j. The first
upper bounds for the approximate arrival rates, obtained from (30), are

$$U_j^{(1)} = K_j\mu_{2j}. \qquad (40)$$

The associated approximate CPU utilization of class j, say $\rho_j$, is thus
$\rho_j \approx K_j\mu_{2j}/\mu_{1j}$, and the associated approximate total utilization of the
CPU, say $\rho$, is $\rho \approx \rho_1 + \rho_2$. The first lower bounds on the approximate
arrival rates, obtained from (31), are

$$L_j^{(1)} = (K_j - \rho_j/(1 - \rho))\mu_{2j}. \qquad (41)$$

The associated approximation for the total CPU utilization is thus

$$\rho = L_1^{(1)}/\mu_{11} + L_2^{(1)}/\mu_{12}. \qquad (42)$$


In this case we only compute the first upper and lower bounds. 
Table IV shows that the approximation procedure works very well. 
We are able to produce results very close to those given in Ref. 13 by 
hand in a few minutes. As suggested in Section VI, the first lower 
bound in (31) and (41) seems to provide a good approximation, even 
though it is a lower bound. From (35), the first upper and lower bounds 
from (40) and (41) are upper and lower bounds on the throughput in 
the closed model. 
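
The hand calculation for the two-class example can be sketched as follows. This is a minimal illustration, assuming the mean service times stated above (1 and 1.5 at the CPU, 450 and 150 at the IS node); the function name `first_bounds` and the list-based interface are ours. For the populations 10/10 and 50/50 the computed utilizations agree with the corresponding Table IV entries to three decimals.

```python
def first_bounds(K, mu_cpu, mu_think):
    """First upper and lower bounds (40) and (41) and the CPU utilizations.

    K, mu_cpu, mu_think are per-class lists: populations, CPU service
    rates, and IS-node (think) service rates.
    """
    upper = [k * m2 for k, m2 in zip(K, mu_think)]            # (40)
    rho_j = [u / m1 for u, m1 in zip(upper, mu_cpu)]
    rho_upper = sum(rho_j)
    if rho_upper >= 1.0:
        raise ValueError("first upper bound saturates the CPU")
    lower = [(k - r / (1.0 - rho_upper)) * m2                  # (41)
             for k, r, m2 in zip(K, rho_j, mu_think)]
    rho_lower = sum(l / m1 for l, m1 in zip(lower, mu_cpu))    # (42)
    return upper, lower, rho_upper, rho_lower

mu_cpu = [1.0, 1.0 / 1.5]           # class 1, class 2 service rates at the CPU
mu_think = [1.0 / 450.0, 1.0 / 150.0]
for K in ([10, 10], [50, 50], [100, 50], [200, 10]):
    _, _, rho_upper, rho_lower = first_bounds(K, mu_cpu, mu_think)
    print(K, round(rho_upper, 3), round(rho_lower, 3))
```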


7.3 Where the FPM method performs poorly 


When the populations are not large, the FPM method can perform 
poorly. This is easily and dramatically demonstrated with the same 
two-node closed network in which there is a single job. Let node 1 
have one exponential server at rate 1 and let node 2 be the designated 
IS node with individual service rate x. 

The exact equilibrium distribution has the job at node 1 with
probability x/(1 + x), which is also the associated long-run flow rate
out of node 2. However, the actual arrival process to node 1 is a
renewal process in which the renewal interval is the sum of two
independent exponential variables, one with mean 1 and the other
with mean 1/x. Moreover, the arrival rate depends dramatically on the
state.


Table IV—A comparison of the approximation method with exact
results for the two-class example in Section 7.2

                         Total Utilization of CPU
Number of Jobs           For the Closed Model                   New Approximation
(Degree of               From Ref. 13          Version 2.1      First Upper   First Lower
Multiprogramming)        CADS        PANACEA   of PANACEA       Bound (40)    Bound (41)
Class 1/Class 2
10/10                    0.118       0.119     0.121            0.122         0.121
50/50                    0.593       0.60      0.599            0.611         0.598
100/50                   Breakdown   0.69      0.706            0.722         0.704
200/10                   Breakdown   0.54      0.540            0.544         0.540


When we carry out the approximation procedure, we treat node 1
as an M/M/1 queue, so that (28) becomes

$$\lambda = [1 - \lambda/(1 - \lambda)]x, \qquad (43)$$

which requires $\lambda < 0.5$ to have a solution. As $x \to \infty$, $\lambda(x) \to 1/2$.
Obviously, the approximation does not work well in this case. The
approximate throughput approaches one-half, while the true value in
the closed model approaches 1 as $x \to \infty$. The normalized difference
$\Delta = (\theta^c - \theta^o)/\mu_1$ approaches one-half, which in Section III we conjectured
was the lower bound.
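
The single-job example is easy to check numerically. In the sketch below (function names are ours), the FPM rate is the smaller root of the quadratic $\lambda^2 - (1 + 2x)\lambda + x = 0$ obtained by clearing the denominator in (43); it is written in a form that avoids cancellation for large x.

```python
import math

def fpm_single_job(x):
    """Smaller root of lambda = [1 - lambda / (1 - lambda)] * x, i.e. (43)."""
    b = 1.0 + 2.0 * x
    # The product of the two roots is x, so the smaller root is x / (larger root).
    return 2.0 * x / (b + math.sqrt(1.0 + 4.0 * x * x))

def exact_single_job(x):
    """Exact closed-model throughput with one job: x / (1 + x)."""
    return x / (1.0 + x)

for x in (0.01, 0.1, 1.0, 10.0, 1000.0):
    print(x, fpm_single_job(x), exact_single_job(x))
```

For small x the two columns are close, while for large x the FPM value stays below one-half and the exact value tends to 1.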


VIII. SUPPORTING THEORY WITH AN INFINITE-SERVER NODE 


In this section we establish some theoretical results that help explain 
why and when the FPM algorithm in Section VI approximates the 
closed models well. As in Section VI, we assume that there is an IS 
node visited by all classes. We show that the subnetwork of the closed 
Markov network without the IS node approaches an open Markov 
network as the populations increase and the service rates at the IS 
node decrease appropriately (see Theorems 8 and 9). As a consequence, 
we show that the FPM method is asymptotically correct for the closed 
model under these conditions (see Theorem 12). 


8.1 A sequence of closed models 


As in Section VI, there are p job classes and q + 1 nodes, with node
q + 1 being the IS node that is visited by every class. We consider a
sequence of systems indexed by the superscript n. Let $\mu_j^n$ be the
individual service rate of class j at node q + 1 in the nth system. Let
$K_j^n$ be the fixed customer population of class j in system n. As with
Poisson approximations for the binomial distribution and as in Ref.
18, the idea is to let $K_j^n \to \infty$ and $\mu_j^n \to 0$ in such a way that $K_j^n\mu_j^n \to \lambda_j$
for each j as $n \to \infty$.

We let the remaining network structure and parameters be fixed,
independent of n; neither the total numbers of nodes q + 1 and classes
p, nor the parameters of the q-node subnetwork change with n. We
still assume the basic Markov Jackson network structure specified in
Section I, modified to allow multiple classes, but many of the results
extend to more general models (see subsequent remarks).

Let $p_{ji}$ be the probability that a departure of class j from the IS node
goes next to node i in the q-node subnetwork. (There could be immediate
feedback to the IS node, which occurs for class j with probability
$1 - \sum_{i=1}^{q} p_{ji}$.)

Let $A_{ji}^{cn}(t)$ be the counting process in the nth closed system representing
the number of departures of class j from node q + 1 in the
interval [0, t] that go next to node i. Let $N_{ij}^{cn}(t)$ represent the number
of class j customers at node i at time t in the nth closed system. Let
$A_{ji}^{cn}$, $A^{cn}$, $N_{ij}^{cn}$, and $N^{cn}$ represent the associated stochastic processes,
i.e.,

$$A_{ji}^{cn} = \{A_{ji}^{cn}(t), t \ge 0\}$$

$$A^{cn} = \{A_{ji}^{cn}; 1 \le j \le p, 1 \le i \le q\}$$

$$N_{ij}^{cn} = \{N_{ij}^{cn}(t), t \ge 0\}$$

$$N^{cn} = \{N_{ij}^{cn}; 1 \le i \le q, 1 \le j \le p\}. \qquad (44)$$


We can initialize the closed networks at time 0 in various ways. For
example, we could assume that all $K_1^n + \cdots + K_p^n$ customers initially
are at node q + 1. We will later simply assume that the initial
distributions converge to a proper limit, which includes this situation
as a special case.

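Let $\Pi_{ji}(\lambda_{ji}) \equiv \Pi_{ji}(\lambda_{ji}, t) \equiv \{\Pi_{ji}(\lambda_{ji}, t), t \ge 0\}$ be a Poisson counting
process with intensity $\lambda_{ji}$, and let $\Pi \equiv \Pi(\Lambda)$ be a pq-dimensional
vector of independent Poisson processes with intensities
$\Lambda \equiv (\lambda_{ji}; 1 \le j \le p, 1 \le i \le q)$, i.e.,

$$\Pi(\Lambda) = \{\Pi_{ji}(\lambda_{ji}); 1 \le j \le p, 1 \le i \le q\}. \qquad (45)$$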


Let $N_{ij}^o(t)$ represent the number of class j customers at node i at time
t in the q-node open network obtained by deleting the IS node and
replacing its departure process with the external Poisson arrival process
$\Pi(\Lambda)$. By "external" we mean that we have the standard open
model in which future arrivals are independent of the network state
and history; i.e., $\{\Pi_{ji}(\lambda_{ji}, t + u) - \Pi_{ji}(\lambda_{ji}, t), u \ge 0, 1 \le i \le q, 1 \le j \le p\}$
is independent of $\{N_{ij}^o(s), s \le t, 1 \le i \le q, 1 \le j \le p\}$ for each t.
In both the open and closed models, successive service times and
routings are mutually independent and independent of the history of
the network prior to their generation. (At this point we are not using
the FPM method in the open model; the arrival rates are simply
specified as $\Lambda$.) Let $N^o$ be the associated vector-valued stochastic
process.

The following theorem expresses how the q-node subnetwork of the
(q + 1)-node closed network without the IS node approaches a q-node
open network as $n \to \infty$. The convergence of stochastic processes
described below is convergence in distribution (weak convergence),
which we denote by $\Rightarrow$ (see Refs. 28 and 29 and references there). The
stochastic processes are random elements of the function space
$D[0, \infty) \equiv D([0, \infty), R^k)$.

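Theorem 8: Let $\lambda_{ji} = \lambda_j p_{ji}$ for each j and i. If $K_j^n \to \infty$, $\mu_j^n \to 0$, $K_j^n\mu_j^n \to \lambda_j$
for each j, and $N^{cn}(0) \Rightarrow N^o(0)$ in $R^{pq}$ as $n \to \infty$, where $N^o(0)$ is
a proper random vector [$P(N_{ij}^o(0) < \infty) = 1$ for all i and j], then as
$n \to \infty$

(a) $A^{cn} \Rightarrow \Pi(\Lambda)$
and
(b) $N^{cn} \Rightarrow N^o$.

(c) If, in addition, $\{[N_{ij}^{cn}(0)]^k\}$ is uniformly integrable, then

$$E[N_{ij}^{cn}(t)]^k \to E[N_{ij}^o(t)]^k$$

for each i, j, and t.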

In the proof, as in Section IV, we use the notion of stochastic order.
One random element (random element of $R$, $R^k$, $D[0, \infty)$, etc.) $X_1$ is
stochastically less than or equal to another $X_2$, denoted by $X_1 \le_{st} X_2$,
if $Eh(X_1) \le Eh(X_2)$ for all nondecreasing real-valued functions h for
which the expectations are well defined. For this a partial ordering
must be defined on the sample space, which we take to be the usual
one; e.g., $(x_1, \ldots, x_k) \le (y_1, \ldots, y_k)$ in $R^k$ if $x_i \le y_i$ for each i, and
$\{x(t), t \ge 0\} \le \{y(t), t \ge 0\}$ in $D[0, \infty)$ if $x(t) \le y(t)$ for each t.
Proof: (a) The proof follows Ref. 18, which establishes convergence to
a Poisson process for the departure process of certain G/GI/$\infty$ queues
under similar conditions. The result is not already contained in Ref.
18 because the arrival process to the IS node here is changing with n.
However, by Corollary 1 to Theorem 1 in Ref. 18, it suffices to show
that $\Lambda^{cn} \Rightarrow \Lambda\omega$, where $\omega(t) = 1$, $t \ge 0$, as in (3.1) of Ref. 18, and $\Lambda^{cn}$ is
the stochastic intensity of the counting process $A^{cn}$, defined by

$$\Lambda_{ji}^{cn}(t) = [K_j^n - N_j^{cn}(t)]\mu_j^n p_{ji}, \quad t \ge 0. \qquad (46)$$

(For related theory, see Ref. 30 and references there.) The desired
weak convergence of $\Lambda^{cn}$ follows easily from (46) because, for any T,

$$\sup_{0 \le t \le T} N_j^{cn}(t) \le_{st} N_j^{cn}(0) + \Pi_{ji}(K_j^n\mu_j^n p_{ji}, T), \qquad (47)$$

where $\le_{st}$ denotes stochastic order defined above and the two quantities
on the right are independent. Since we have assumed that $N^{cn}(0)$
converges and that $N^o(0)$ is proper and since $K_j^n\mu_j^n p_{ji} \to \lambda_{ji}$ as $n \to \infty$,
(47) implies that the sequence $\{N^{cn}\}$ is uniformly tight (see p. 37 of
Ref. 28). This implies that $N_j^{cn}\mu_j^n \Rightarrow 0\omega$ as $n \to \infty$ and the desired
conclusion.
(b) Convergence in distribution of $N^{cn}$ follows by model continuity
as in Refs. 31 and 32. In particular, given part (a), we can construct
versions of $A^{cn}$ and $\Pi(\Lambda)$ on the same sample space so that there is
convergence of the sample paths, using the Skorohod embedding
theorem. Using the same service times and routing in all systems,
we obtain convergence of the sample paths of $N^{cn}$ to $N^o$ with probability
one on the specially constructed space. (Since $\Pi(\Lambda)$ has no fixed
jump points, simultaneous transitions need not be considered.) This
implies convergence in distribution of the separate stochastic processes.

(c) The stochastic dominance used in part (a) and the new condition
imply that the random variables $\{[N_{ij}^{cn}(t)]^k, n \ge 1\}$ are uniformly integrable
(see p. 32 of Ref. 28). Part (b) implies that $N_{ij}^{cn}(t) \Rightarrow N_{ij}^o(t)$
as $n \to \infty$ for each i, j, and t. Theorem 5.4 of Ref. 28 thus implies
convergence of the moments. □


Remarks: 1. All the conditions on $N_{ij}^{cn}(0)$ hold trivially if all
$K_1^n + \cdots + K_p^n$ jobs are initially at the IS node for each n.

2. The conditions in Theorem 8 can be relaxed. The q-node subnetwork
of the closed model can be quite general; e.g., the service-time
distributions can be nonexponential with FCFS nodes. Our proof only
exploits the fact that the service-time distribution at the IS node is
exponential. The service-time distribution at the IS node could be
made general too, as in Ref. 18, but then we would have to be careful
with the initial conditions. If the initial residual service-time distributions
at the servers are independent stationary-excess distributions
of a service-time distribution that has no mass at zero, then part (a)
holds by virtue of the limit theorem for the superposition of independent
and identically distributed (i.i.d.) stationary renewal processes.
Of course, if the service-time distribution has positive mass at zero,
then the limit process is, instead, batch Poisson with geometric
batches. □

Theorem 8 implies that the random vectors $A^{cn}(t)$ and $N^{cn}(t)$ in $R^{pq}$
converge in distribution as $n \to \infty$ for each t, but Theorem 8 says
nothing about the equilibrium distributions. In fact, we have not yet
ruled out the possibility that the open network is unstable; i.e., we
could have $N_{ij}^o(t) \Rightarrow \infty$ as $t \to \infty$. Indeed, Theorem 8 is still valid in
this case, but now we consider the equilibrium distributions. Let
$N^{cn}(\infty)$ and $N^o(\infty)$ be random vectors with the equilibrium or limiting
distributions as $t \to \infty$. (For the continuous-time Markov chains, they
are necessarily unique.) We assume that the limiting Poisson intensities
$\lambda_{ji}$ are small enough so that the equilibrium or limiting distribution
for $N^o(t)$ exists.


Theorem 9: Assume that a proper equilibrium distribution exists for
$N^o$. Also, assume that either (1) there is a single job class or (2) the
sequence $\{N^{cn}(\infty), n \ge 1\}$ is uniformly tight. Then, under the conditions
of Theorem 8, $N^{cn}(\infty) \Rightarrow N^o(\infty)$ in $R^{pq}$ as $n \to \infty$.


We defer the proof of Theorem 9 until we develop some stochastic 
comparison tools, which are interesting in their own right. We are able 
to establish the desired stochastic comparison result (see Theorem 11) 
only when there is a single job class, which explains the second 
assumption in Theorem 9. We conjecture that the required tightness 
in Theorem 9 can be proved from the other assumptions for multiple 
classes. 
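
The equilibrium convergence in Theorem 9 can be illustrated with the two-node example of Section VII, for which both equilibrium distributions are available in closed form. The sketch below is an illustration only: it takes one job class and one single-server node plus the IS node, holds the product of the population and the IS service rate fixed at a value `lam`, and measures the total variation distance between the closed-model queue-length distribution at node 1 and the geometric distribution of the limiting open model. The function names are ours.

```python
def closed_pmf(K, mu1, mu2):
    """Equilibrium pmf of the number of jobs at node 1 in the closed model."""
    weights = [1.0]
    for k in range(K):
        # w(k + 1) / w(k) = (mu2 / mu1) * (K - k)
        weights.append(weights[-1] * (mu2 / mu1) * (K - k))
    total = sum(weights)
    return [w / total for w in weights]

def open_pmf(lam, mu1, n_terms):
    """First n_terms values of the geometric (M/M/1) equilibrium pmf."""
    rho = lam / mu1
    return [(1.0 - rho) * rho ** k for k in range(n_terms)]

def tv_distance(lam, mu1, K):
    """Total variation distance between the closed model with K jobs and
    IS rate lam / K, and the open model with arrival rate lam."""
    p_closed = closed_pmf(K, mu1, lam / K)
    p_open = open_pmf(lam, mu1, K + 1)
    tail = max(0.0, 1.0 - sum(p_open))   # open-model mass above K
    return 0.5 * (sum(abs(a - b) for a, b in zip(p_closed, p_open)) + tail)

for K in (10, 100, 1000):
    print(K, tv_distance(0.5, 1.0, K))
```

The printed distances shrink as K grows with the product of K and the IS rate held fixed, which is the regime described by the theorem.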


8.2 Stochastic comparisons 


For the counting processes $A^{cn}$ and $\Pi(\Lambda)$, we use the notion of
stochastic order based on conditional failure rates or stochastic intensities,
introduced in Ref. 34. The stochastic intensity of the vector-valued
stochastic process $A^{cn}(t) \equiv \{A_{ji}^{cn}(t)\}$ is defined in (46). Of course,
the stochastic intensity of the Poisson process $\Pi(\Lambda)$ is the deterministic
function $\Lambda\omega$. Following Ref. 34, the counting process $A^{cn}$ is said
to be stochastically less than or equal to the Poisson process $\Pi(\Lambda)$ in
the sense of conditional failure rates, here denoted by $A^{cn} \le_h \Pi(\Lambda)$, if

$$\Lambda_{ji}^{cn}(t) \le \lambda_{ji} \qquad (48)$$

with probability 1 for all j, i, and t ($\le_1$ is used in Ref. 34). From (46),
it is easy to see that indeed (48) is satisfied. Hence, trivially we have
Theorem 10.

Theorem 10: If $K_j^n\mu_j^n \le \lambda_j$ for each j, then $A^{cn} \le_h \Pi(\Lambda)$.

Corollary to Theorem 9: In the setting of Section VI, $K_j\mu_j$ is an upper
bound for the expected average throughput for class j over any time
interval. Hence, the first upper bound for the FPM method in (30)
yields an upper bound for the long-run throughput of each class in the
closed model.

We now establish a general stochastic comparison between $N^{cn}$ and
$N^o$. We exploit a coupling or special, almost surely ordered construction,
as in Refs. 33 and 34. To establish a general comparison result
for $N^{cn}$ and $N^o$, we assume that there is a single customer class. We
thus drop the j subscript. We also exploit the fact that the processes
$N^{cn}$ and $N^o$ are continuous-time Markov processes, but now the service
rate at node i when there are k customers present can be a general
nondecreasing function, say $\mu_i(k)$, for $1 \le i \le q$.

Theorem 11: Suppose that there is a single job class with $K^n\mu^n \le \lambda$. Let
the processes $N^{cn}$ and $N^o$ be Markov with the service rate functions
$\mu_i(k)$ nondecreasing in k for each i, $1 \le i \le q$.

(a) If $N^{cn}(0) \le_{st} N^o(0)$ in $R^q$, then $N^{cn} \le_{st} N^o$ in $D[0, \infty)$.
(b) If, in addition, the equilibrium distribution for the open network
exists, then also $N^{cn}(\infty) \le_{st} N^o(\infty)$ in $R^q$.
Proof: (a) The argument parallels that of Theorems 6, 7, and 10 in
Ref. 34. For more details on the method, see Sonderman. First,
Theorem 10 implies that versions of the arrival processes $A^{cn}$ and
$\Pi(\Lambda)$ can be constructed on the same probability space so that the
points of $A_{ji}^{cn}(t)$ form a subsequence of the points in $\Pi_{ji}(\lambda_{ji}, t)$ for each
j (j = 1 here) and i (the ordering $\le_2$ in Ref. 34). Next we can construct
the service completions for $N^{cn}$ using the service completions of $N^o$. If
there is a service completion at node i in process $N^o$ at time t, then we
let there be a corresponding service completion at node i in $N^{cn}$ with
probability $\mu_i(N_i^{cn}(t))/\mu_i(N_i^o(t))$. When there are service completions
in both processes, we let the routing be identical. By using induction
on the transition epochs, we see that this special construction keeps
the sample paths ordered and the distributions of the individual
stochastic processes $N^{cn}$ and $N^o$ unchanged.

(b) The stochastic order for each t as a consequence of part (a) is
preserved in the limit as $t \to \infty$ (see Proposition 3 of Ref. 33). □
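
The coupling used in the proof can be sketched in simulation for the simplest case. The fragment below is an illustration under our own simplifying assumptions: one job class, a single exponential server of rate `mu1` in the subnetwork (q = 1), both queues initially empty, and only the embedded jump chain of the uniformized processes is simulated, since event times are not needed to check the ordering. Arrivals in the closed model are obtained by thinning the Poisson points, and service completions are shared.

```python
import random

def coupled_ordering_holds(K, mu_is, lam, mu1, n_events, seed=0):
    """Run the coupled closed/open pair and report whether the closed
    queue length never exceeds the open queue length.

    Closed model: K jobs, IS node with individual rate mu_is, one server
    of rate mu1. Open model: Poisson arrivals of rate lam >= K * mu_is.
    """
    if lam < K * mu_is:
        raise ValueError("need lam >= K * mu_is")
    rng = random.Random(seed)
    n_closed = n_open = 0
    for _ in range(n_events):
        if rng.random() < lam / (lam + mu1):
            # Poisson point: always an arrival in the open model, kept in
            # the closed model with probability (K - n_closed) * mu_is / lam.
            n_open += 1
            if rng.random() < (K - n_closed) * mu_is / lam:
                n_closed += 1
        elif n_open > 0:
            # Service completion in the open model; the closed model follows
            # with probability mu(n_closed) / mu(n_open), which is 1 or 0
            # for a single server.
            n_open -= 1
            if n_closed > 0:
                n_closed -= 1
        if n_closed > n_open:
            return False
    return True

print(coupled_ordering_holds(K=20, mu_is=0.02, lam=0.5, mu1=1.0, n_events=20000))
```

Because the thinning probability vanishes when all K jobs are in the queue, the closed population constraint is respected automatically.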


Remarks: 1. It is not difficult to see that Theorem 11(a) is not true for
multiple job classes. For example, consider a network with two nodes
plus the IS node and three job classes. Let class j jobs go from the IS
node to node j and then back to the IS node for j = 1, 2. Let class 3
jobs go from the IS node to node 2, then node 1 and then back to the
IS node. Let all service rates be identical at nodes 1 and 2. Let $K_1^n$ and
$K_3^n$ be large and $\mu_1^n$ and $\mu_3^n$ be small so that the arrival processes of
classes 1 and 3 are both nearly Poisson in the closed model. On the
other hand, let $K_2^n = 1$, so that $A_{22}^{cn}$ is considerably smaller (stochastically)
than the Poisson process associated with the open model. Let
nodes 1 and 2 be initially empty. For some relatively short initial time
interval, say [0, t], in the open model there are more arrivals of class
2 to node 2, with negligible change for classes 1 and 3. These class 2
jobs at node 2 tend to impede the class 3 jobs at node 2, so that the
class 3 jobs come to node 1 more slowly in the open model. Hence,
the class 1 jobs can get through node 1 more easily; thus, we can have
$EN_{11}^o(t) < EN_{11}^{cn}(t)$ even though $K_j^n\mu_j^n \le \lambda_j$.

2. Even though Theorem 11(a) does not extend to multiple job
classes, we conjecture that Theorem 11(b) does. That would be sufficient
to eliminate conditions (1) and (2) in Theorem 9.


Proof of Theorem 9: Since $N^{cn}$ and $N^o$ are continuous-time Markov
processes with the given equilibrium distributions, we can apply Theorem
8(b) here and Lemma 1 of Ref. 31. This implies that the desired
convergence $N^{cn}(\infty) \Rightarrow N^o(\infty)$ holds provided that $\{N^{cn}(\infty), n \ge 1\}$ is
uniformly tight. We use the fact that $N^{cn}$ and $N^o$ have unique equilibrium
distributions. For the case of a single job class, the sequence
$\{N^{cn}(\infty), n \ge 1\}$ is uniformly tight by Theorem 11(b). The stochastic
dominance implies the desired uniform tightness because each individual
probability measure is tight (see Theorem 1.4 of Ref. 28). □

Remarks: 1. Of course, Theorem 9 applies to other non-Markov product-form
models that have the same equilibrium distributions by virtue
of insensitivity properties.

2. Theorem 9 also holds for more general service-time distributions
in the q-node subnetwork provided that we can establish the uniform
tightness. The original processes $N^{cn}$ and $N^o$ can be made Markov by
appending supplementary variables.

3. If the service-time distribution for class j at the IS node is phase
type instead of exponential, then Theorem 10 remains valid with
$\lambda_j^* = K_j^n\mu_j^{*n}$, where $\mu_j^{*n}$ is the maximum phase service rate for class j.
If the open network process $N^o$ is stable with the high intensities $\Lambda^*$,
then we can apply the analog of Theorem 11 to obtain the tightness
needed in Theorem 9 (again for a single job class).

We now show that the first lower bound for the FPM method in
(31) is a lower bound for the throughput in the closed network. As
stated, this follows from (32) and Theorem 1, but we make stronger
comparisons using the stochastic intensity $\Lambda^{cn}$ of the arrival process
$A^{cn}$, defined in (46).


Corollary to Theorem 11: Suppose that there is a single job class with
$K^n\mu^n = \lambda$. Then, for each i and t,

(a) $\Lambda_{1i}^{cn}(t) \ge_{st} [K^n - N^o(t)]\mu^n p_{1i}$
and
(b) $E\Lambda_{1i}^{cn}(t) \ge E[K^n - N^o(t)]\mu^n p_{1i}$.

If, in addition, both systems are in equilibrium, then

(c) $E\left[t^{-1}\int_s^{s+t} \Lambda^{cn}(u)\,du\right] \ge L^{(1)}$ for all s and t
and
(d) $\theta^c \ge L^{(1)}$.


8.3 The FPM method is asymptotically correct 


We now apply Theorems 8 and 9 to deduce that the FPM method
in Section VI is asymptotically correct. Due to the second assumption
in Theorem 9, we only completely treat the case of one job class. Let
$A^{on}$ and $N^{on}$ be the vector-valued arrival process and queue length
process obtained by using the FPM method with the nth closed model.


Theorem 12: Under the conditions of Theorem 8, as $n \to \infty$,

(a) $A^{on} \Rightarrow \Pi(\Lambda)$,
(b) $N^{on} \Rightarrow N^o$,
and
(c) $E(N_{ij}^{on}(t))^k \to E(N_{ij}^o(t))^k$

for each i, j, k, and t.
(d) Under the conditions of Theorem 9, for sufficiently large n,
$N^{on}(\infty)$ exists as a proper random vector and $N^{on}(\infty) \Rightarrow N^o(\infty)$ in $R^{pq}$.


Proof: (a) Since $A^{on}$ is a Poisson process for each n, it suffices to show
that the associated arrival rates converge. For this, it suffices to show
that the difference between the first lower bound in (31) and the upper
bound in (30) is asymptotically negligible, which is immediate under
the conditions of Theorem 8. Parts (b) and (c) follow exactly as in
Theorem 8. Theorems 10 and 11 extend easily when $A^{on}$ and $N^{on}$
replace $A^{cn}$ and $N^{cn}$ since the limiting system is the first upper bound
for the FPM method. Finally, part (d) follows exactly as in Theorem
9. □


IX. A BOTTLENECK NODE WITH A LARGE POPULATION 
9.1 A different approximation procedure 


In this section we observe that the methods and results of Sections 
VI through VIII also apply, after appropriate modification, to closed 
networks with a bottleneck non-IS node. We first consider the case of 
one job class. For large populations, all servers at the bottleneck node 
will usually be busy, so that we can approximately analyze the original 
closed model by using the bottleneck node to decouple the network 
just as we used the IS node in Section VI. We remove the bottleneck 
node and replace its departure process by an external arrival process. 
We then solve, exactly or approximately, the resulting open network. 
If there are s servers at the bottleneck node, then the external arrival 
process would be the superposition of s i.i.d. renewal processes each 
having the bottleneck service-time distribution as the renewal-interval 
distribution. The routing of the external arrivals is just the original 
routing from the bottleneck node. When the service-time distribution 
at the bottleneck node is exponential, the approximating external 
arrival process is thus Poisson. Otherwise, we would approximately 
characterize the external superposition arrival process as in Ref. 7 and 
apply the algorithm there to approximately analyze the resulting non- 
Markov open network. 

In this setting the bottleneck node is easy to identify. As in Section
III, we begin by replacing one internal arrival process by the external
arrival process with a rate sufficiently small to ensure stability. Let $\lambda_i$
be the net arrival rate to node i obtained from solving the traffic rate
equations with the given external arrival rate, say $\lambda_0$. Then calculate
the traffic intensity at node i as $\rho_i = \lambda_i/s_i\mu_i$, given that node i is a FCFS
node with $s_i$ servers, each working at rate $\mu_i$. The node with the highest
traffic intensity is the bottleneck node; call it node q + 1. We assume
that there are no ties. The capacity of the network is thus $s_{q+1}\mu_{q+1}$. We
can achieve any throughput less than $s_{q+1}\mu_{q+1}$ in the open model. The
traffic intensity becomes 1 at node q + 1 at the capacity, which makes
the system unstable. It of course is well known that $s_{q+1}\mu_{q+1}$ is an
upper bound on the throughput even in non-Markov networks (see
Ref. 11 and references there).

The proposed approximation procedure for the closed model with a
large population is to solve the traffic rate equations for the associated
open network and find the bottleneck node, which we denote by node
q + 1. Then solve the open model obtained by deleting node q + 1
from the closed network and inserting an external arrival process with
rate $s_{q+1}\mu_{q+1}$. However, unlike Sections III and VI, we do not use the
FPM method for the full (q + 1)-node network; we do not require a
consistency condition such as (28). We simply let the approximate
number of jobs at node q + 1 in the original closed network be

$$EN_{q+1}^c(K) \approx K - \sum_{i=1}^{q} EN_i^o(s_{q+1}\mu_{q+1}). \qquad (49)$$

If there are quite a few nodes but not a large population, we will use
the original FPM method, but as the population grows with the number
of nodes fixed, the effect of the bottleneck node becomes more pronounced.
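
The bottleneck procedure can be sketched for a small Markov network. The fragment below is an illustration under our own assumptions: one job class, every node a single exponential server, a hypothetical three-node central-server routing matrix, and relative arrival rates obtained from a simple averaged power iteration (the averaging is our choice to avoid periodicity). The function names and example parameters are not from the paper.

```python
def visit_ratios(P, iters=500):
    """Stationary vector v = vP of an irreducible routing matrix P."""
    n = len(P)
    v = [1.0 / n] * n
    for _ in range(iters):
        vP = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        v = [0.5 * (a + b) for a, b in zip(v, vP)]   # lazy step: removes periodicity
    return v

def bottleneck_approximation(P, mu, K):
    """Return the bottleneck index and approximate mean queue lengths.

    Non-bottleneck nodes are treated as M/M/1 queues in the open network
    fed at the bottleneck's full service rate; the bottleneck receives
    the remaining population, as in (49).
    """
    v = visit_ratios(P)
    load = [vi / mi for vi, mi in zip(v, mu)]        # relative traffic intensities
    b = max(range(len(mu)), key=lambda i: load[i])
    mean_queue = {}
    for i in range(len(mu)):
        if i == b:
            continue
        lam_i = mu[b] * v[i] / v[b]                  # arrival rate at node i
        rho_i = lam_i / mu[i]
        mean_queue[i] = rho_i / (1.0 - rho_i)        # M/M/1 mean number in system
    mean_queue[b] = K - sum(mean_queue.values())     # approximation (49)
    return b, mean_queue

# Hypothetical example: node 0 routes to nodes 1 and 2 with equal probability.
P = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0]]
mu = [2.0, 1.5, 3.0]
print(bottleneck_approximation(P, mu, K=50))
```

The approximation is intended for large K; for a small population the value assigned to the bottleneck node can be unreasonably small or even negative, in which case the original FPM method is the appropriate tool.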


9.2 Limit theorems 


The approximation procedure just described is evidently quite well
known. Supporting limit theorems are discussed by Whittle and
Brown and Pollett. The methods and results of Section VIII provide
a convenient way to prove that the approximation procedure is asymptotically
correct for the q-node subnetwork excluding the bottleneck
node as the population grows. Since the results and methods are
similar to those in Section VIII, we only give a brief account. The
analog of Theorem 8 is for one customer class. We let $K^n \to \infty$ as
before, but now we fix $\mu$, the individual service rate at node q + 1.
When the service-time distribution at the bottleneck node is exponential,
we can use the obvious modification of the proof of Theorem 8(a).
With general service-time distributions, it is easy to show that the
probability that all servers are busy at the bottleneck node throughout
any interval [0, t] converges to 1 as $n \to \infty$. The rest of Section VIII
applies in a straightforward manner, with essentially the same remarks
about generalizations.


9.3 Multiple job classes


We now consider multiple job classes with a special bottleneck node.
We assume that there is a single-server processor-sharing bottleneck
node with fixed total service rate $\mu$ whenever any customers are
present. Let the service requirements of class j at the bottleneck node
be exponentially distributed with mean $\mu_j^{-1}$. Let the population of
class j in the network be $K_j$.

Again the approximation is obtained by replacing the bottleneck
node by an external Poisson process with rate $\mu$. Each of these external
arrivals is from class j with probability

$$\gamma_j = K_j\mu_j/(K_1\mu_1 + \cdots + K_p\mu_p). \qquad (50)$$

Consequently, as in Section VI, there is a pq-dimensional vector of
independent Poisson processes with the intensity of class j going to node
i being $\lambda_{ji} = \mu\gamma_j p_{ji}$. The limit theorems in Section VIII also apply here.
As before, Theorem 11(a) does not hold for multiple job classes.
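
The class split in (50) and the resulting Poisson intensities amount to a short calculation, sketched below. The function name, the list-based interface, and the example numbers are hypothetical; `routing[j][i]` stands for the probability that a class j departure from the bottleneck node goes next to node i.

```python
def class_intensities(K, mu_class, mu_total, routing):
    """Class probabilities from (50) and the intensities of class j
    arrivals to node i in the approximating open network."""
    weights = [k * m for k, m in zip(K, mu_class)]
    total = sum(weights)
    probs = [w / total for w in weights]                       # (50)
    lam = [[mu_total * g * p for p in row] for g, row in zip(probs, routing)]
    return probs, lam

# Hypothetical two-class, two-node example.
probs, lam = class_intensities(
    K=[100, 50], mu_class=[1.0, 2.0], mu_total=3.0,
    routing=[[1.0, 0.0], [0.5, 0.5]])
print(probs, lam)
```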


9.4 Another stochastic comparison 


We now make a stochastic comparison between the closed model
and the open model resulting from the bottleneck approximation. We
consider the case of one job class. We compare the q-dimensional
equilibrium distribution of the subnetwork of the closed model without
the bottleneck node to the q-dimensional equilibrium distribution in
the q-node open model with external arrival rate $s_{q+1}\mu_{q+1}$. We show
that the equilibrium distribution based on the bottleneck approximation
is larger in a very strong sense, namely, in the MLR ordering
used in Section 4.3.

Let $(N_1^c, \ldots, N_q^c)$ be the equilibrium random vector in the closed
model with population K without the bottleneck node, defined in
terms of an associated (q + 1)-dimensional open-model equilibrium
random vector $(N_1^o, \ldots, N_{q+1}^o)$ by

$$P(N_1^c = k_1, \ldots, N_q^c = k_q) = \frac{P(N_1^o = k_1) \cdots P(N_q^o = k_q)\, P\left(N_{q+1}^o = K - \sum_{j=1}^{q} k_j\right)}{P(N^o = K)} \qquad (51)$$

for $(k_1, \ldots, k_q)$ such that $k_1 + \cdots + k_q \le K$.
Let $(N_1^b, \ldots, N_q^b)$ be the open-model equilibrium random vector
with utilization at node i of $u_i^o/u_{q+1}^o$, where $u_i^o$ is the utilization of node
i in $(N_1^o, \ldots, N_{q+1}^o)$ in (51). We assume that $u_i^o < u_{q+1}^o$ for all i. This
is tantamount to having an external Poisson arrival process with rate
$\mu_{q+1}s_{q+1}$ in the q-node open network.


Theorem 13: $(N_1^c, \ldots, N_q^c) \le_{lr} (N_1^b, \ldots, N_q^b)$.

Proof: It is immediate that the distribution of $(N_1^b, \ldots, N_q^b)$ is MTP$_2$
because the marginals are independent (see Proposition 3.5 of Ref.
50). Consequently, by Theorem 3 of Ref. 48, it suffices to show that
$p_1(y)p_2(x) \le p_1(x)p_2(y)$ for all $x \le y$, where $p_1$ and $p_2$ are the associated
probability mass functions. Moreover, it suffices to consider y differing
from x by 1 in only one place, e.g., $x = (k_1, \ldots, k_q)$ and $y = (k_1 + 1,
k_2, \ldots, k_q)$. We verify this as follows:

$$\frac{P(N_1^c = k_1 + 1, \ldots, N_q^c = k_q)\, P(N_1^b = k_1, \ldots, N_q^b = k_q)}{P(N_1^c = k_1, \ldots, N_q^c = k_q)\, P(N_1^b = k_1 + 1, \ldots, N_q^b = k_q)}$$

$$= \frac{P(N_1^o = k_1 + 1)\, P\left(N_{q+1}^o = K - \sum_{j=1}^{q} k_j - 1\right) P(N_1^b = k_1)}{P(N_1^o = k_1)\, P\left(N_{q+1}^o = K - \sum_{j=1}^{q} k_j\right) P(N_1^b = k_1 + 1)} \le 1$$

because, for all j,

$$P(N_i^b = j + 1)/P(N_i^b = j) = (u_{q+1}^o)^{-1} P(N_i^o = j + 1)/P(N_i^o = j)$$

and

$$P(N_{q+1}^o = j + 1)/P(N_{q+1}^o = j) \ge u_{q+1}^o. \qquad (52)$$

To verify (52), recall that $N_{q+1}^o$ has the equilibrium distribution of a
birth-and-death process, so that

$$\lambda_j P(N_{q+1}^o = j) = \mu_{j+1} P(N_{q+1}^o = j + 1),$$

where $\lambda_j$ is the arrival rate when $N_{q+1}^o = j$, which is independent of j,
and $\mu_{j+1}$ is the service rate when $N_{q+1}^o = j + 1$. When there is one
server, $u_{q+1}^o = \lambda_j/\mu_{j+1}$ for all j, but in general, $\mu_{j+1} = \mu_{q+1}\min\{j + 1, s\}$,
so that $u_{q+1}^o \le \lambda_j/\mu_{j+1}$. □

Remarks: 1. Theorem 13 has corollaries like those for Theorem 6. For
example,

$$P(N_i^c \ge k_i \mid a_j \le N_j^c \le b_j) \le P(N_i^o \ge k_i \mid a_j \le N_j^o \le b_j) \tag{53}$$

for all $i$, $j$, $k_i$, $a_j$, and $b_j$. Inequality (53) is interesting both when $i = j$
and $i \ne j$. Of course, when $i \ne j$, the right-hand side of (53) reduces to
$P(N_i^o \ge k_i)$.

2. As in Section VIII, we can obtain results for the equilibrium 
distribution associated with other queue disciplines by invoking insen- 
sitivity properties.*®!%° 

3. Algorithms for identifying bottleneck nodes and treating them 
are described by Schweitzer™* and Goodman and Massey.” Stochastic 
bounds for open networks of single-server nodes are contained in
Massey.*® These bounds apply to closed networks too by combining
them with the comparison results in this paper. 

4. As the population increases, the closed network can be said to be
in heavy traffic. However, only the bottleneck node accumulates jobs 
in the limit. The number of jobs at the nonbottleneck nodes is 
asymptotically negligible compared to the number at the bottleneck 
node. In fact, by the analog of Theorem 9, the number of jobs at all 
nonbottleneck nodes, unnormalized, converges to a proper limit, as 
the population grows. Instead of the complicated multidimensional 
diffusion process approximations for networks of queues described in 
Reiman,” we have significant accumulation of customers only at the 
bottleneck node alone. The situation here is an example of the diffu- 
sion approximations with state space collapse discussed by Reiman.”® 
However, because we are considering a closed network, the number of
customers at the bottleneck node is best described by $K - \sum_{i=1}^{q} N_i^o$.
Indeed, as a trivial corollary to the analog of Theorem 9, we have

$$(N_{q+1}^{cn} - K^n) \Rightarrow -\sum_{j=1}^{q} N_j^o \tag{54}$$

as $n \to \infty$. Unless there are ties for the maximum traffic intensity,
only one node will be a bottleneck node for both closed and open 
networks. Moreover, because of the geometric tails of the queue length 
equilibrium distributions in Markov networks, slight differences in 
traffic intensities will rapidly lead to large differences in the queue 
lengths as the population grows. Consequently, the case of a single 
bottleneck node treated here seems most relevant for applications. 


X. APPROXIMATIONS FOR NON-MARKOV CLOSED NETWORKS 


10.1 Several possible approximation procedures 


Suppose, as in Ref. 7, that the Markov property is lost because we 
are considering FCFS nodes with nonexponential service-time distri- 
butions. There are two natural procedures for calculating approximate 
congestion measures for such non-Markov closed networks based on 
previously developed approximations for non-Markov open networks. 
Just as we can use Markov open models to analyze Markov closed 
models, we can use the approximate solution for an associated non- 
Markov open model to generate an approximate solution for the given 
non-Markov closed model. 

The first procedure for non-Markov closed models starts with the 
approximate equilibrium distribution of the number of customers at 
each node in the associated open model, as described in Section III. 
Then the corresponding equilibrium distribution for the closed model 
can be obtained by conditioning as in (9). For the open model, the
standard approximation procedure is to use a product-form solution
(an equilibrium distribution with independent marginal distributions). 
This is the procedure first suggested by Reiser and Kobayashi.®’ The 
Extended-Product-Form (EPF) method of Shum and Buzen*®*? and 
the Generalized-Product-Form (GPF) method of Tripathi® are also 
variants of this approach. A complete approximation thus is deter- 
mined by specifying the equilibrium distribution of the number of 
customers at each node in the open model. For example, with QNA (Ref. 7)
this can be done by fitting a discrete distribution to the quantities
$P(N_i^o = 0)$, $E(N_i^o)$, and $\mathrm{Var}(N_i^o)$, which are currently provided in the
model solution. In fact, in Ref. 7 an approximation for the waiting- 
time distribution at each node is obtained in this way. For single- 
server nodes, it is natural to use mixtures and convolutions of geo- 
metric distributions for the conditional distribution of the number of 
customers at each node, given that the server is busy. Such an 
approximation procedure based on QNA is currently being investi- 
gated. 

There are some difficulties with this first procedure, however. We 
must do the same extensive calculation to find the normalization 
constant G as we do with the Markov closed model, so that we obtain 
no reduction in computation working with approximations. We can of 
course use many of the same algorithms now being used for Markov 
closed networks.® 

The second procedure is to use the open model directly, as with the 
FPM method. We believe that this method can be expected to work 
about as well as it does for closed Markov models. Now, in the setting 
of Ref. 7 we also have variability parameters. In particular, we must 
specify a variability parameter as well as an arrival rate for the special 
new external arrival process. 

There are three different situations. First, with a decoupling IS node 
containing most of the customers (under the conditions of Section 
VI), it is natural to use the FPM method and approximate the external 
arrival process by a Poisson process, so that there is no problem 
selecting the variability parameter; set it equal to 1. However, now it 
is important that the external Poisson arrival process replace the 
departure process from the special IS node. This process will be 
approximately Poisson, even with nonexponential service-time distri- 
butions. 

Second, with a bottleneck node having s servers as in Section IX, it 
is natural to regard the arrival process as the superposition of s 
independent renewal processes each with the bottleneck service-time 
distribution as the renewal-interval distribution. When s = 1, the 
procedure is clear: use the squared coefficient of variation of the 
bottleneck service-time distribution. When s > 1, we can use approximations
for superposition processes as in Section 4.3 of Ref. 7. As
described in Section IX, we would not use the FPM method, but 
instead the open model with the bottleneck node removed. 

The third situation is where the FPM method is appropriate but 
the variability parameter needs to be determined. In Section 10.2 we 
discuss this case in detail. 

There are of course many other procedures for approximately ana- 
lyzing non-Markov closed networks with nonexponential FCFS 
nodes,®1!22!-©2 but we do not discuss them here.


10.2 The FPM method for non-Markov models 


A simple procedure for the FPM method more generally, in the case
of a single job class, is to first specify an external arrival rate $\lambda_0$ and
then, for that specified arrival rate, solve a system of linear equations
to obtain the variability parameter $c_0^2(\lambda_0)$ that makes the variability
parameter of the departure process from the network equal to the
variability parameter of the external arrival process. (The reason for
doing this, of course, is that in the closed network these two processes
are actually the same process.) We then solve the open model for a
range of possible external arrival rates, associating $c_0^2(\lambda_0)$ with $\lambda_0$ each
time. As before, the throughput when the population is $K$ is the value
of $\lambda_0$ such that $EN^o = K$.

We now describe in detail a modification of the QNA algorithm in 
Ref. 7 that has been developed to approximately analyze a closed non- 
Markov network of queues with one job class by the FPM method. 
The initial model is just as in Ref. 7 but without external arrival 
processes. In particular, the nodes have the FCFS discipline, several 
servers, and general service-time distributions. We assume that the 
reader is familiar with Ref. 7, and we use the same notation here. 

The model input is a minor modification of the standard input in 
Section 2.1 of Ref. 7; we just omit the data for the external arrival 
processes. For each network we specify the following: 


$n$ = number of nodes in the network

$m_j$ = number of servers at node $j$

$\tau_j$ = mean service time at node $j$

$c_{sj}^2$ = squared coefficient of variation of the service-time distribution
at node $j$

$q_{ij}$ = proportion of those customers completing service at node $i$ that
go next to node $j$.


To apply the FPM method, we introduce an external arrival process 
to one node, which we stipulate is node 1. To see how the expected 
network population depends on the external arrival rate, we specify a 
set of external arrival rates, which are understood to apply to node 1.
The set is specified by the following numbers:


L = lower bound for external arrival rate to node 1 
U = upper bound for external arrival rate to node 1 
C = number of different arrival rates. 


Given the triple $(L, U, C)$, the network will be analyzed $C$ times
with the following external arrival rates to node 1:

$$\lambda_{01} = L + k(U - L)/(C - 1) \tag{55}$$

for $k = 0, 1, \ldots, C - 1$. The external arrival rates to all other nodes
are zero. To obtain the open model for the FPM method in each case,
we insert an external arrival process to node 1 with one of the rates
specified in (55) and we eliminate all internal arrivals to node 1. This
is done with the algorithm by setting $q_{i1} = 0$ for all $i$.
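The construction of the open model can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the traffic-rate equations are solved by successive substitution rather than by whatever linear solver QNA uses, and the three-node cyclic routing matrix is an assumed example.

```python
def arrival_rate_grid(L, U, C):
    """External arrival rates to node 1 as in (55), for k = 0, ..., C - 1."""
    if C < 2:
        raise ValueError("C must be at least 2 for (55) to be defined")
    return [L + k * (U - L) / (C - 1) for k in range(C)]


def open_routing(Q):
    """Copy of the routing matrix with all internal arrivals to node 1 removed."""
    return [[0.0] + list(row[1:]) for row in Q]


def traffic_rates(lam01, Q_open, tol=1e-12, max_iter=100000):
    """Solve lambda_j = lambda_0j + sum_i lambda_i q_ij by successive substitution.

    Only node 1 (index 0) has external arrivals, at rate lam01.
    """
    n = len(Q_open)
    ext = [lam01] + [0.0] * (n - 1)
    lam = ext[:]
    for _ in range(max_iter):
        new = [ext[j] + sum(lam[i] * Q_open[i][j] for i in range(n))
               for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, lam)) < tol:
            return new
        lam = new
    raise RuntimeError("traffic-rate iteration did not converge")


# Assumed example: three-node cyclic network 1 -> 2 -> 3 -> 1.
Q = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
rates = arrival_rate_grid(0.2, 0.8, 4)
flows = [traffic_rates(lam, open_routing(Q)) for lam in rates]
```

In the cyclic example, every node carries the external arrival rate once the return arc to node 1 is cut, so each entry of `flows` is a constant vector.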

We begin by solving the traffic-rate equations, given the external
arrival rate $\lambda_{01}$, exactly as in Section 4.1 of Ref. 7. This provides the
traffic intensities at the nodes, needed for the traffic variability equations.

Next we solve the traffic variability equations. The algorithm also
determines the variability parameter $c_{01}^2$ for the external arrival process
to node 1. As indicated above, the idea is to have the variability
parameter of the external arrival process agree with the variability
parameter of the total departure process from the network (which
would have been the arrival process to node 1 in the closed model).


The equations in (24) of Ref. 7 are valid for $j = 2, \ldots, n$; i.e., we have

$$c_{aj}^2 = a_j + \sum_{i=1}^{n} c_{ai}^2 b_{ij}, \quad 2 \le j \le n, \tag{56}$$

with $a_j$ and $b_{ij}$ in (25) and (26) of Ref. 7. Since $q_{i1} = 0$ for all $i$, $c_{a1}^2 =
c_{01}^2$. The variability parameters are solved by replacing the first equation
in (24) of Ref. 7 with

$$c_{a1}^2 = a_1^* + \sum_{i=1}^{n} c_{ai}^2 b_i^*, \tag{57}$$

where

$$a_1^* = 1 + w^* \Bigl\{ -1 + \sum_{i=1}^{n} (d_i/d) \Bigl[ \sum_{j=2}^{n} q_{ij} + \Bigl(1 - \sum_{j=2}^{n} q_{ij}\Bigr) \rho_i^2 \Bigl(1 + \frac{c_{si}^2 - 1}{\sqrt{m_i}}\Bigr) \Bigr] \Bigr\}, \tag{58}$$

$$b_i^* = w^* (d_i/d) \Bigl(1 - \sum_{j=2}^{n} q_{ij}\Bigr) (1 - \rho_i^2); \tag{59}$$

$d_i$ is the departure rate from the network at node $i$, $d = d_1 + \cdots + d_n$
as in (23) of Ref. 7, and $w^*$ is the superposition weighting function in
(29) of Ref. 7 with $p_{ij}$ in (30) there replaced by $d_i/d$.

We derive (57) through (59) as follows. First, the departure process
from the whole network is the superposition of the departure processes
(leaving the network) from the separate nodes. Hence, by Section 4.3
of Ref. 7,

$$c_{01}^2 = w^* \Bigl( \sum_{i=1}^{n} (d_i/d)\, c_{di}^{*2} \Bigr) + 1 - w^*, \tag{60}$$

where $c_{di}^{*2}$ is the variability parameter for the departure process from
the network at node $i$; i.e.,

$$c_{di}^{*2} = \Bigl(1 - \sum_{j=2}^{n} q_{ij}\Bigr) c_{di}^2 + \sum_{j=2}^{n} q_{ij} \tag{61}$$

and

$$c_{di}^2 = 1 + (1 - \rho_i^2)(c_{ai}^2 - 1) + \frac{\rho_i^2}{\sqrt{m_i}} (c_{si}^2 - 1), \tag{62}$$

using first the splitting formula (36) and then the departure formula
(39) from Ref. 7.
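The substitution behind (57) through (59) can be written out explicitly. The following is a sketch of the algebra only; the shorthand $r_i = \sum_{j=2}^{n} q_{ij}$ is introduced here for brevity and is not notation from the paper.

```latex
% Shorthand (assumed here): r_i = \sum_{j=2}^{n} q_{ij}.
% Step 1: insert the departure formula (62) into the splitting formula (61).
\begin{align*}
c_{di}^{*2}
  &= (1 - r_i)\Bigl[1 + (1 - \rho_i^2)(c_{ai}^2 - 1)
       + \frac{\rho_i^2}{\sqrt{m_i}}(c_{si}^2 - 1)\Bigr] + r_i \\
  &= r_i + (1 - r_i)\,\rho_i^2\Bigl(1 + \frac{c_{si}^2 - 1}{\sqrt{m_i}}\Bigr)
       + (1 - r_i)(1 - \rho_i^2)\, c_{ai}^2 .
\end{align*}
% Step 2: insert this into (60) and use \sum_i d_i/d = 1 and c_{a1}^2 = c_{01}^2.
\begin{align*}
c_{a1}^2
  &= 1 + w^*\Bigl\{-1 + \sum_{i=1}^{n} \frac{d_i}{d}
       \Bigl[r_i + (1 - r_i)\,\rho_i^2
       \Bigl(1 + \frac{c_{si}^2 - 1}{\sqrt{m_i}}\Bigr)\Bigr]\Bigr\}
     + \sum_{i=1}^{n} c_{ai}^2\, w^* \frac{d_i}{d}(1 - r_i)(1 - \rho_i^2) .
\end{align*}
% The first group of terms is a_1^* in (58); the coefficient of c_{ai}^2 is b_i^* in (59).
```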

The rest of the modified QNA algorithm is just as in Ref. 7. We 
next calculate the congestion measures at the nodes using the traffic 
rate and variability parameters already determined. By running the 
algorithm a few times with various (L, U, C) triples, the user can 
easily select a set of external arrival rates to node 1 via (55) to yield a 
desired range of expected network populations in the open model. The 
algorithm also can automatically find the external arrival rate yielding 
a specified expected equilibrium network population. 

We can also use the finite-waiting-room refinement introduced in 
Section 1.3 to calculate the congestion measures at the nodes in the 
open model. For single-server nodes, we use the modifications in (2) 
and (3), even if the nodes do not correspond to M/M/1 models.

Remarks: 1. Our procedure above replaces an internal arrival process
to node 1 by an external arrival process. Instead, we could have
replaced the internal departure process from node 1 by an external
arrival process. It would then have been immediately split according
to the routing probabilities $q_{1j}$. The original departures from node 1
would then be removed.

2. As mentioned in Section I, it may be desirable to artificially 
deflate the variability parameters of the arrival processes. It is natural 
to do this after the traffic variability equations have been solved as 
described above.

3. The procedure can easily be extended to multiple job classes in
various ways. For example, we can let the variability parameters of 
the external arrival processes for each individual class be unspecified. 
We then can apply the procedure in (56) through (59) in this section 
to specify the variability parameter for the overall external arrival 
process in the aggregated single-class network obtained from Section 
2.3 of Ref. 7. The only remaining complication is that instead of the 
external arrival rates $\lambda_{01}$ determined by the single triple (L, U, C), we
now have a vector of external arrival rates determined by such a triple 
for each job class. Automatic search obviously becomes desirable in 
this setting. 

4. The approximate solution using the FPM method can be fruitfully
combined with the exact solution of the corresponding Markov model
to obtain improved approximations for the closed non-Markov model.
For example, we can solve the closed Markov model to obtain $u_i^{cM}$ as
the utilization of node $i$ when the service-time distributions are all
exponential. We can also apply the FPM method twice, once with
general service-time distributions and once with exponential service-time
distributions, to obtain corresponding utilizations $u_i^{oG}$ and $u_i^{oM}$.
We can then approximate $u_i^{cG}$, the utilization at node $i$ in the closed
network with nonexponential service-time distributions, by

$$u_i^{cG} = u_i^{cM} u_i^{oG} / u_i^{oM}. \tag{63}$$

Since (63) can lead to inconsistencies such as $u_i^{cG} > 1$, it is natural to
use

$$(1 - u_i^{cG}) = (1 - u_i^{cM})(1 - u_i^{oG})/(1 - u_i^{oM}) \tag{64}$$

for the node $i$ with the largest utilization. We then calculate $u_j^{cG}$ for
the other nodes using (15), which is justified for non-Markov models
as well as Markov models, e.g., by Little's formula, (4.2.3) of Ref. 19.
At least, the ratios $u_i^{oG}/u_i^{oM}$ and $(1 - u_i^{oG})/(1 - u_i^{oM})$ can give a rough
idea of how much the nonexponential service-time distributions matter.
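As a numerical illustration of the refinement (64), the short sketch below applies it to the values reported in Table V for deterministic service times and arrival rate 1.00. The function name is hypothetical; the three input values are read from Table V.

```python
def refine_utilization(u_closed_markov, u_open_general, u_open_markov):
    """Refinement (64): rescale the idle fraction of the exact closed Markov solution."""
    idle = (1.0 - u_closed_markov) * (1.0 - u_open_general) / (1.0 - u_open_markov)
    return 1.0 - idle


# Table V, arrival rate 1.00: exact Markov value 0.9167,
# FPM with deterministic service 0.912, FPM with exponential service 0.846.
u_refined = refine_utilization(0.9167, 0.912, 0.846)
```

The result is approximately 0.952, which agrees with the entry in the last column of Table V(b); by construction the refined value cannot exceed 1 when the three inputs are below 1.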


XI. THROUGHPUT BOUNDS IN NON-MARKOV CLOSED NETWORKS 


It is sometimes claimed that closed Markov models are suitable even 
when the service-time distributions are not exponential. In particular, 
it is sometimes claimed that the utilizations and throughputs, at least, 
do not depend critically on aspects of the service-time distributions 
beyond their means. Of course, this is trivially true for certain special 
service disciplines such as processor sharing, for which there are 
insensitivity results,?°”? but with FCFS nodes the service-time dis- 
tribution matters. Even with FCFS nodes, there is significant justifi- 
cation for this view if the service-time distributions do not depart too
drastically from the exponential distribution. However, in general,
throughputs obtained with the Markov model can be very bad approx- 
imations, as we show in this section. For example, for a cyclic network 
with n single-server nodes, equal mean service times, and K customers, 
we show that the set of possible utilizations for each server is the
interval $(n^{-1}, 1]$ for all $K \ge n$, whereas the utilization is $K/(n + K - 1)$
in the Markov model by (14). For large n and K and arbitrarily
unfavorable service-time distributions with given means, the Markov 
approximation can be arbitrarily bad. The true value can be arbitrarily 
close to 0, while the Markov approximation is arbitrarily close to 1. 

We consider the same non-Markov closed model as in Section X, 
containing FCFS nodes with general service-time distributions, but 
we restrict attention to single-server nodes. All the service times are 
assumed to be mutually independent and the service times at any 
given node are identically distributed. There is a single job class with 
K jobs. A job completing service at node $i$ is routed immediately to
node $j$ with probability $q_{ij}$, independent of the history. The matrix
$Q = (q_{ij})$ is a Markov chain transition matrix, which we assume is
irreducible. Consequently, there is a unique equilibrium distribution
associated with $Q$, defined by

$$\lambda_j = \sum_{i=1}^{n} \lambda_i q_{ij}, \quad 1 \le j \le n, \tag{65}$$

with $\lambda_1 + \cdots + \lambda_n = 1$. By the law of large numbers, $\lambda_j$ is the long-run
fraction of transitions that each customer spends in node $j$. The
system of equations (65) is also the basic traffic-rate equations for the
network of queues. The throughput or equilibrium flow rate through
node $i$, say $\theta_i^c$, is proportional to $\lambda_i$, i.e., $\theta_i^c = \gamma\lambda_i$ for some constant $\gamma$.

Let $\tau_i$ be the mean service time (which we assume is finite and
strictly positive) and let $u_i^c$ be the utilization (long-run fraction of time
that the server is busy) at node $i$. By Little's law, (4.2.3) of Ref. 19, or
by the law of large numbers again, we know the ratio of the utilizations,
i.e.,

$$u_i^c/u_j^c = \lambda_i\tau_i/\lambda_j\tau_j \tag{66}$$

for any $i$ and $j$, just as for the Markov model in (15).

We exhibit the infimum of the server utilizations possible for 
service-time distributions with the given means. As will soon be clear, 
the infimum is approached by quite unusual service-time distributions, 
so that we do not rule out the possibility that the Markov model can 
provide good throughput approximations for typical nonexponential 
service-time distributions. The idea for minimizing utilizations is 
really quite simple. For our model, at any time at least one server must
be busy. Hence, the sum of the server utilizations must be at least unity:

$$\sum_{i=1}^{n} u_i^c \ge 1. \tag{67}$$


A lower bound on the server utilizations is the case in which there is 
no concurrency, i.e., no two servers are ever busy at the same time. 
This lower bound is obviously valid in much greater generality. It is 
also attained in the case K = 1. It is somewhat remarkable that this 
lower bound is actually approached for any K with the general inde- 
pendent service times allowed here. This observation was apparently 
first made by Arthurs and Stuck." 

It is, in fact, not difficult to attain this lower bound asymptotically 
by considering special sequences of service-time distributions with a 
common mean that get successively more variable. In particular, for
$m \ge 1$, let $X_m$ be a random variable distributed as

$$P(X_m = m) = 1 - P(X_m = 0) = m^{-1}. \tag{68}$$

[Alternatively, $(X_m \mid X_m > 0)$ could have some other distribution with
mean $m$, such as exponential.] Then let the service time at node $i$ be
distributed as $\tau_i X_m$.
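To see how quickly the distributions in (68) become more variable while keeping a common mean, the moments can be computed exactly. This is a small sketch with a hypothetical function name; it uses only the definition in (68).

```python
from fractions import Fraction


def two_point_moments(m):
    """Mean, variance, and squared coefficient of variation of X_m in (68)."""
    p = Fraction(1, m)              # P(X_m = m); X_m = 0 otherwise
    mean = m * p
    second_moment = m * m * p
    variance = second_moment - mean ** 2
    scv = variance / mean ** 2
    return mean, variance, scv


moments = {m: two_point_moments(m) for m in (1, 10, 100)}
```

The mean is 1 for every $m$, while the variance and the squared coefficient of variation both equal $m - 1$, so the service time $\tau_i X_m$ keeps mean $\tau_i$ as its variability grows without bound.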

Theorem 14: (a) The infimum of the possible utilizations of server $i$ for
this closed network model over all service-time distributions with
specified means is

$$\inf u_i^c = \lambda_i\tau_i \Big/ \sum_{j=1}^{n} \lambda_j\tau_j.$$

(b) If the service times are not all deterministic, then the infimum is
not attained for $K > 1$, but is approached asymptotically for all nodes
simultaneously as $m \to \infty$ using the service-time distributions of $\tau_i X_m$
described above.

Proof: (a) We informally sketch the proof. For very large m, occasion- 
ally (among all service times generated) a long service time occurs at 
some node. With high probability, thereafter all the other customers 
instantaneously fly around the network until they arrive at this node, 
where they all wait together in queue. (There is only one server at 
each node.) The only other possibility, which occurs with asymptotically
negligible probability as $m \to \infty$, is that one of the other customers
encounters another nonzero service time before all of the customers 
are gathered together at the same node. This event yielding the 
concurrency has asymptotically negligible probability because the 
distribution of the number of transitions for any job to go from any 
node i to any other node j does not change with m. Hence, for each of 
the K — 1 customers there is a fixed random number of trials (with 
finite mean and variance) to generate a new nonzero service time, but 
the probability of doing so on each trial is $m^{-1}$. Hence, the proportion
of time during which two or more servers are simultaneously busy
converges to zero as $m \to \infty$.

(b) On the other hand, it is trivial that concurrency cannot be ruled 
out altogether when K > 1 and there is some randomness. For any 
model with strictly positive expected service times, at least one non- 
deterministic distribution, and an irreducible routing matrix, concur- 
rency occurs with positive probability. The limiting case above is not
legitimate because $X_m$ converges in distribution to the random variable
$X$ with $P(X = 0) = 1$. □

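The gap between the infimum in Theorem 14 and the Markov value can be computed directly. The sketch below is illustrative only: the function names are hypothetical, the equilibrium distribution in (65) is found by a damped power iteration (chosen here so that periodic routing matrices such as cyclic ones also converge), and the cyclic example with $n = 10$ and $K = 100$ is assumed.

```python
def routing_equilibrium(Q, iters=5000):
    """Equilibrium distribution of an irreducible routing matrix, as in (65).

    Uses the damped update lam <- (lam + lam Q) / 2, which has the same
    fixed point as lam = lam Q but also converges for periodic chains.
    """
    n = len(Q)
    lam = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(lam[i] * Q[i][j] for i in range(n)) for j in range(n)]
        lam = [0.5 * a + 0.5 * b for a, b in zip(lam, step)]
    return lam


def utilization_infimum(Q, tau):
    """Infimum of the server utilizations from Theorem 14(a)."""
    lam = routing_equilibrium(Q)
    total = sum(l * t for l, t in zip(lam, tau))
    return [l * t / total for l, t in zip(lam, tau)]


def markov_cyclic_utilization(n, K):
    """Utilization K/(n + K - 1) in the balanced cyclic Markov model."""
    return K / (n + K - 1)


n, K = 10, 100
cyclic = [[1.0 if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]
inf_u = utilization_infimum(cyclic, [1.0] * n)
markov_u = markov_cyclic_utilization(n, K)
```

For this example the infimum is 0.1 at every node, while the Markov model gives a utilization of about 0.917.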
Remarks: 1. It is not necessary to have all service-time distributions 
be of the special form (68). It suffices to have all but one. The other 
one can be arbitrary. At this designated node there will be a succession 
of ordinary service times after which the customer usually returns 
immediately to the end of the queue. When a customer does get a 
nonzero service time elsewhere, the others get there relatively quickly 
with high probability when they complete service. It is again not 
difficult to show that the proportion of time that there is concurrency 
is asymptotically negligible. 

2. A next step would be to obtain tighter bounds under extra 
conditions as, for example, in Refs. 63 through 65 and references there. 
However, as noted above, the special service-time distributions can be 
$H_2$ (hyperexponential: a mixture of two exponential distributions), so
that does not help. It would obviously help to fix the variance though.
We conjecture that the $X_m$ distributions would yield the minimum
then. 

3. The infimum decreases rapidly as the number of nodes increases.
The range of possible server utilizations is not so great with two nodes. For
example, suppose that $q_{12} = q_{21} = 1$, which makes the network cyclic.
Then $\lambda_1 = \lambda_2 = 1/2$ and $\inf u_i^c = \tau_i/(\tau_1 + \tau_2)$. In this case the maximum
is 1, which is attained with the deterministic service-time distributions
for K = 2. As before, the infimum corresponds to the case K = 1. In
the case of balanced loads, the utilization of each server must lie
between one-half and one. It is useful to recall that this two-server
cyclic network is equivalent to an M/G/1/K-1 queueing model when
one of the two service-time distributions is exponential (see Ref. 66,
p. 33 of Ref. 67, and Ref. 2). The M/G/1/K-1 model means that we
have an external Poisson arrival process, a single queue with one
server, and an additional waiting room of size K - 1. Since we approach
the infimum if all but one service-time distribution is of the special
kind, the infimum is also valid for the M/G/1/K-1 queueing model.


XII. MORE NUMERICAL COMPARISONS


We have indicated that the approximation methods should perform 
better for larger closed networks, but it is nevertheless useful to
compare their performance for smaller ones. It is certainly important
to realize the limitations of these procedures. They often perform 
poorly for small networks. 

It is particularly convenient to consider two-node closed networks 
because these networks are equivalent to special single-node models 
that have been studied extensively and for which there are tables of 
exact values. 


12.1 Two single-server nodes 


As we noted in Section XI, the closed model with K jobs (all of one 
class) and two single-server nodes, one of which has an exponential 
service-time distribution, is equivalent to an M/G/1 model with a 
finite waiting room of size K — 1. Similarly, the two-node closed model 
with K jobs and one IS node having exponential service-time distributions
is equivalent to a finite-source M/G/1 model with K sources.
Tables for M/G/1 models with finite waiting room and finite sources 
are contained in Ref. 67, for example. 

Table V here displays the exact values and various approximations
for the throughput and the expected equilibrium number of jobs
present in an M/G/1 model having a waiting room of size 10.

Table V. A comparison of exact throughput and mean number in
system in the M/G/1/10 model having a finite waiting room of size
10 with approximations based on the bottleneck method and the
FPM method for G = M, D, and H2

Arrival   Exact Values      Bottleneck Method        FPM Method       Predicted θ
Rate      θ        EN_1     θ       EN_1             θ       EN_1     With (64)

(a) Exponential Service Times (M)
0.50      0.4999    1.00    0.500    1.00            0.455    0.84
0.75      0.7418    2.61    0.750    3.00            0.674    2.07
1.00      0.9167    5.50    1.000    ∞               0.846    5.50
1.40      0.9928    8.72    1.000    8.50            0.902    9.19
2.00      0.9998   10.00    1.000   10.00            0.910   10.16

(b) Deterministic Service Times (D)
0.50      0.5000    0.75    0.500    0.75            0.461    0.65    0.500
0.75      0.7494    1.85    0.750    1.88            0.695    1.43    0.744
1.00      0.9538    5.57    1.000    ∞               0.912    4.93    0.952
1.40      0.9998    9.61    1.000    9.39 (9.69)     0.973    9.68    0.998
2.00      1.0000   10.37    1.000   10.25 (10.61)    0.987   10.38    1.000

(c) Hyperexponential Service Times With c² = 2.25 and Balanced Means
0.50      0.4983    1.27    0.500    1.31            0.449    1.05    0.500
0.75      0.7248    3.01    0.750    4.41            0.651    2.71    0.739
1.00      0.8829    5.34    1.000    ∞               0.787    6.00    0.885
1.40      0.9791    8.15    1.000    7.38            0.833    9.06    0.987
2.00      0.9985    9.75    1.000    9.69            0.840   10.10    1.000

This corresponds to a closed network with a population of 11 and two
single-server nodes. Three service-time distributions are considered: 
exponential, deterministic, and hyperexponential (mixture of two ex- 
ponentials). The hyperexponential distribution has squared coefficient 
of variation c? = 2.25 and balanced means (see p. 8 of Ref. 67 or 
Section 3 of Ref. 68). The service time is set equal to 1 and five arrival 
rates are considered: 0.50, 0.75, 1.00, 1.40, and 2.00. The arrival rate 
of 2.00 (1.40) corresponds to a traffic intensity of 0.50 (0.71) when the 
nodes are switched. In each case, the FPM and bottleneck approxi- 
mations are displayed in addition to the exact values. For the nonex- 
ponential service times, the refinement in (64) is also displayed. 

The exact values come from Tables 5.1.6, 5.2.6, and 5.4.12 in Section 
II.5 of Ref. 67, using the FIFO or FCFS discipline. The approximations 
are obtained using the GI/G/1 formulas (47) and (44) in Ref. 5 with g 
in (45) of Ref. 5 set equal to 1. The approximate values using the 
Kramer-and-Langenbach-Belz correction term in (45) of Ref. 7 are 
given in parentheses to the right of the other values in Table V (b). 

There are several important conclusions to draw from Table V. 
First, as we should expect from Section III, the FPM method performs 
poorly, much worse than the bottleneck method. However, it is im- 
portant to remember that this small network tends to be a worst case 
for the FPM method. It is also significant that the refinement sug- 
gested in (64) produces quite accurate results. With this job population 
(waiting room size), the bottleneck method seems to work reasonably 
well as long as the utilization of the bottleneck node is no more than 
about 0.75. 
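The bottleneck and FPM columns of Table V(a) can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions, not the QNA implementation: both nodes are treated as M/M/1 queues, node 1 has service rate 1, the other node has service rate equal to the arrival rate, the population is 11, and the function names are hypothetical.

```python
import math


def fpm_throughput(a, K, tol=1e-12):
    """Throughput at which the open-model mean population equals K.

    Node 1 has service rate 1 and the other node has service rate a, so
    the open-model utilizations at throughput theta are theta and theta / a.
    """
    def expected_population(theta):
        r1, r2 = theta, theta / a
        return r1 / (1.0 - r1) + r2 / (1.0 - r2)

    lo, hi = 0.0, min(1.0, a)       # the throughput must keep both nodes stable
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_population(mid) < K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bottleneck_method(a, K):
    """Throughput and mean number at node 1 when the bottleneck node is removed."""
    if a < 1.0:                     # the other node is the bottleneck
        return a, a / (1.0 - a)
    if a > 1.0:                     # node 1 is the bottleneck
        return 1.0, K - 1.0 / (a - 1.0)
    return 1.0, math.inf            # tie: the open model is unstable


K = 11
theta_fpm = fpm_throughput(1.0, K)
mean_fpm = theta_fpm / (1.0 - theta_fpm)
theta_bn, mean_bn = bottleneck_method(1.4, K)
```

For arrival rate 1.00 the FPM throughput is $11/13 \approx 0.846$ with mean 5.50 at node 1, and for arrival rate 1.40 the bottleneck method gives throughput 1.000 and mean 8.50, matching the corresponding entries of Table V(a).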

It is also useful to consider the Finite-Waiting-Room (FWR) refinement
introduced in Section 1.3 in the context of Table Va. When
combined with the bottleneck procedure, the FWR refinement obviously
makes the approximation exact when n = 2. The FPM method
with the FWR refinement is also exact in the special case of equal
service rates. When n = 2, we must have $EN_1^o = EN_2^o = K/2$ by the
FPM method. With the FWR method, this implies that $\rho_i = 1$ and
$P(N_i^o = K) = 1/(K + 1)$, so that the resulting utilization is
$u_i^c = K/(K + 1) > K/(K + 2) = u_i^o$, using (11) and (14). However, more generally the FPM/FWR
method does not perform well, at least for the mean numbers at each
node, when n = 2. For example, when the arrival rate is 0.50, the
FPM/FWR approximations are a throughput of 0.69 and $EN_1 = 2.17$. However,
applying (3), we obtain 0.495 as a lower bound on the throughput.
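The equal-service-rate case can be checked numerically. The sketch below assumes that the FWR refinement at a single-server node amounts to the M/M/1/K distribution, which is consistent with the values $\rho_i = 1$ and $P(N_i^o = K) = 1/(K + 1)$ quoted above; the function name is hypothetical.

```python
def mm1k_pmf(rho, K):
    """Equilibrium distribution of the number in system in an M/M/1/K queue."""
    weights = [rho ** k for k in range(K + 1)]
    total = sum(weights)
    return [w / total for w in weights]


K = 11
pmf = mm1k_pmf(1.0, K)
mean = sum(k * p for k, p in enumerate(pmf))
utilization_fwr = 1.0 - pmf[0]          # server busy unless the node is empty
utilization_fpm = K / (K + 2)           # plain FPM value, from rho/(1 - rho) = K/2
```

With $\rho_i = 1$ the distribution is uniform on $\{0, \ldots, K\}$, so the mean is $K/2$ and the utilization is $K/(K + 1)$, which exceeds the plain FPM value $K/(K + 2)$.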

From Table Vb we see that the bottleneck method continues to 
perform well for the deterministic service-time distribution. In fact, 
the bottleneck approximation is clearly much better than using the 
exact M/M/1 values in Table Va as an approximation, which is often 
what is done in practice.

However, from Table Vc we see that the quality of the bottleneck
approximation deteriorates when we consider the more variable hy- 
perexponential service-time distribution. Of course, the throughputs 
are always close and the mean queue lengths are good when the traffic 
intensity at the bottleneck node is 0.5, but when the traffic intensity 
is 0.70 or 0.75, the open-network view exaggerates the impact of the 
greater variability. The fixed population in the closed model tends to 
damp the effect. 

We can also see what happens as we change the service-time 
variability in Table V. From the exact values, we see that the through- 
put decreases in every case, but the throughputs are never near the 
lower bound in Theorem 14. We also see that the expected number of 
jobs at that node decreases when the arrival rate is greater than 1.00. 
This phenomenon was observed by Bondi® and is further discussed 
in Bondi and Whitt.® Briefly, the explanation is that under moderate 
to heavy loads increased variability in the service-time distribution 
often has a greater impact on other nodes via their arrival processes 
than on the congestion at the given node. It is significant that this 
qualitative behavior is captured by both the bottleneck and FPM 
methods using QNA. However, the FPM method fails to capture this 
bottleneck phenomenon in other cases. As noted in Refs. 61 and 62, 
the bottleneck phenomenon is useful to test procedures for approxi- 
mately solving non-Markov closed networks. 

Approximately characterizing the variability of arrival processes in 
a tightly coupled closed network such as the two-node model being 
discussed is difficult because of the constraint on the total population. 
If the utilization of a server is high, then the interdeparture times are 
distributed approximately the same as the service times, but the 
population constraint tends to induce negative correlations among the 
interdeparture times: several long (short) times are more likely to be 
followed by a short (long) one. Hence, the effective variability of an 
arrival process, e.g., as described by the asymptotic method in Ref. 68, 
is likely to be considerably less in a closed network than in an open 
one. This is the reason for developing heuristic procedures to reduce 
the variability parameters of the arrival processes in the approxima- 
tion method. 


12.2 One single-server node and one infinite-server node 


Table VI displays exact and approximate results for a two-node 
network containing an IS node with exponential service-time distri- 
butions. In this case, the service-time distribution at the single-server 
node is always exponential. It is easy to apply the FPM approximation 
to other cases, but we had no convenient tables. The exact values from 
Table 2.10.7 of Ref. 67 are obtained by specifying the population 

Table VI—A comparison of exact throughput and mean number in
system in the M/G/1 queue having a finite source of size 10 with
approximations based on the FPM method

                                  Exact                         First Upper
                                  Values         FPM            Bound

(a) Total Population 10 and Utilization 0.50

Arrival rate per idle source      0.0547         0.0547 given   0.0547 given
Throughput or utilization         0.500 given    0.494          0.547
Expected number in system, EN     0.86           0.98           1.21

(b) Total Population 10 and Utilization 0.75

Arrival rate per idle source      0.0927         0.0927 given   0.0927 given
Throughput or utilization         0.750 given    0.705          0.927
Expected number in system, EN     1.91           2.39           12.70


(number of sources) and the throughput. Paralleling Table V, we 
consider two cases: a population of 10 and throughputs of 0.50 and 
0.75. As in Table V the service time is set equal to 1. The population 
and throughput determine the arrival rate per idle source (individual
service rate at the IS node). This is the starting point for the FPM 
approximation, which is obtained from (28) or (88). The conditions 
are clearly much less favorable in Table VI than in Tables I and IV; 
the ratio of service rates (IS/other) was much less before. Nevertheless, 
the FPM method works quite well, at least in the case of utilization 
0.50. From Tables V and VI, we see that the FPM method does indeed 
perform better with the IS node. Related numerical comparisons are 
contained in Ref. 69. The overall performance here based on the FPM 
method is somewhat better than the performance in Ref. 69, which is 
based on matching the server utilization. The performance in Tables 
I and IV is much better than we might at first expect from Ref. 69, 
but recall that in Tables I and IV, as the total population decreases 
the server utilization decreases because the service rate at the IS node 
is held fixed. Nevertheless, the tables in Ref. 69 help assess how well 
the FPM method will perform for a small network with an IS node. 
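
The Table VI entries can be reproduced numerically. The sketch below is one reading of the calculation, not a transcription of equations (28) or (88), which are not shown in this section. It assumes exponential service with mean 1 at the single-server node, treats the open-model approximation as an M/M/1 queue, and takes the FPM arrival rate to be the root of lam = nu * (N - lam / (1 - lam)), where nu is the arrival rate per idle source and N is the population. All function names are hypothetical.

```python
def exact_finite_source(n_sources, nu, mu=1.0):
    """Exact throughput and mean number at the single-server node for the
    exponential finite-source model (assumed here to match Table VI)."""
    # Unnormalized state weights: w_k = N!/(N-k)! * (nu/mu)**k
    weights = [1.0]
    for k in range(1, n_sources + 1):
        weights.append(weights[-1] * (n_sources - k + 1) * nu / mu)
    total = sum(weights)
    probs = [w / total for w in weights]
    throughput = mu * (1.0 - probs[0])          # server busy probability times mu
    mean_number = sum(k * p for k, p in enumerate(probs))
    return throughput, mean_number


def mm1_mean_number(lam, mu=1.0):
    """Mean number in system for an open M/M/1 queue (requires lam < mu)."""
    rho = lam / mu
    return rho / (1.0 - rho)


def fpm_finite_source(n_sources, nu, mu=1.0, tol=1e-12):
    """Assumed FPM fixed point: the open arrival rate lam must equal nu times
    the expected number of idle sources, N - L(lam). Solved by bisection."""
    lo, hi = 0.0, mu
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # g is strictly decreasing in lam: positive below the root, negative above
        g = nu * (n_sources - mm1_mean_number(mid, mu)) - mid
        if g > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, mm1_mean_number(lam, mu)


def first_upper_bound(n_sources, nu, mu=1.0):
    """Open model fed at the maximum possible rate N * nu from the IS node."""
    lam = n_sources * nu
    return lam, mm1_mean_number(lam, mu)


if __name__ == "__main__":
    for nu in (0.0547, 0.0927):
        print(nu, exact_finite_source(10, nu),
              fpm_finite_source(10, nu), first_upper_bound(10, nu))
```

Under these assumptions, nu = 0.0547 gives an FPM throughput near 0.494 and a mean number near 0.98, against exact values near 0.500 and 0.86, in line with part (a) of Table VI.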


XIII. CONCLUSIONS 


In this paper we identified and investigated three situations in 
which open queueing network models should provide good approxi- 
mations for more difficult closed queueing network models: 

1. When the closed network has many nodes (Sections II through 
V, X), 
2. When the closed network contains a “decoupling” infinite-server 
(IS) node with a relatively low service rate (see Sections VI through 
VIII), 

3. When the closed network contains a non-IS bottleneck node 
under a fairly heavy load (Section IX). 

The suggested approximation procedures in these situations are not 
the same, however. In Case 3 we remove the bottleneck server and 
replace its departure process by an external arrival process, which is 
determined solely by the number of servers and the service-time 
distribution at the bottleneck queue. The arrival rate is the maximum 
possible service rate from the bottleneck node; we do not use the FPM 
method. In contrast, in Case 1 no nodes are removed from the closed 
network. As described in Section X, an entry node is selected and the 
external arrival process there depends on the entire network in a 
rather complicated way. The arrival rate is determined by the FPM 
method. The variability parameter of the arrival process, using QNA,’ 
is chosen so that the variability parameter of the external arrival 
process agrees with the variability parameter of the departure process 
from the network. 

It is interesting that the suggested procedure for Case 2 can be 
regarded as a variation of either the procedure for Case 1 or the 
procedure for Case 3. On the one hand, the procedure for Case 1 can 
be applied without change to Case 2. As described in Section VI, the 
suggested procedure coincides with the FPM method. However, in 
Case 2 we know that the departure process from the IS node is 
approximately a Poisson process. Hence, it is natural to implement 
the FPM method for Case 2 by replacing the departure process from 
the decoupling IS node by an external Poisson arrival process. We 
then use the FPM method to determine the appropriate external 
arrival rate, but we do not have to worry about the variability param- 
eter; we just set it equal to 1. If instead we used the FPM method as 
described for Case 1 and we selected an entry node for an external 
arrival process, then we would need to specify the variability parameter 
of the external arrival process there. In general, in Case 2 the arrival 
processes to other nodes need not be approximately Poisson. However, 
if we apply the standard FPM method for Case 1 to Case 2, then the 
results should be very similar because in Case 2 the FPM method will 
make the variability parameter of the departure process from the 
decoupling IS node nearly 1. 

We can also think of the procedure for Case 2 as a modification of 
the procedure for Case 3. The decoupling IS node also acts as a 
bottleneck queue. Hence, as described in Section VI, we can analyze 
Case 2 by removing the IS node from the closed network and replacing 
its departure process by an external arrival process. Because of the 
nature of this particular bottleneck queue, i.e., because there are many 
servers each with low service rate, it is appropriate to make the 
external arrival process a Poisson process. Incidentally, we would do 
this in Case 3 too if there were many servers, but finitely many, each 
with low service rate. 

If we apply the procedure for Case 3 directly to Case 2, then we let 
the arrival rate of the external Poisson process be the maximal possible 
service rate from the IS node, which corresponds to the first upper 
bound described in Section VI. The suggested modification is to let 
the arrival rate of the external Poisson arrival be such that this arrival 
rate would equal the departure rate at the IS node if it were included 
in the network. As indicated in Section VI, this modification turns 
out to coincide with the FPM method. 

It is important to recognize that the three situations above do not 
nearly cover all possibilities. As indicated in Section I, in some cases 
an open model might also be reasonable from direct modeling consid- 
erations; often the closed model is not entirely appropriate. However, 
it is clear from the analysis and examples here that open models do 
not always produce reasonable, let alone good, approximations for 
closed models. For closed networks with few nodes, few servers per 
node and few jobs, the open-model approximations for closed models 
here tend to perform poorly. Further experimentation is needed to 
better understand the appropriate regions for each procedure. As with 
any approximation tool, it is very helpful in applications to make a 
few initial benchmark comparisons with simulations to determine the 
actual quality of the approximations in that context. 

The specific models discussed in this paper have been relatively 
elementary. Many of the theorems only relate open and closed Markov 
Jackson networks. The major complexity considered was nonexponen- 
tial FCFS servers and the associated network model treated by QNA. 
It is important to realize, however, that the ideas apply much more 
broadly. As discussed by Zahorjan,”” these open-model approximations 
for closed models can be used as modules or subroutines in more 
complicated approximation procedures, e.g., based on network decom- 
position. As illustrated by Fredericks,*° the ideas also apply directly 
to closed models with other complicating features such as priorities. 
On the Application of Energy Contours to the 
Recognition of Connected Word Sequences 


By L. R. RABINER* 
It has recently been shown that small but consistent improvements in 
isolated word recognition accuracy can be obtained by supplementing the 
Linear Predictive Coding (LPC) features for each frame of a word by a 
normalized energy value for that frame. The key idea in using energy is to 
normalize the frame energy by the local energy maximum in time (i.e., relative 
to the peak energy of the spoken word). If we want to extend the concept of 
using frame energy as a supplement to the LPC feature set for connected word 
recognition, we must provide a dynamic method of energy normalization so 
that the peak energy within strings can closely approximate the energy 
contours of individual words strung together. In this paper such a dynamic 
energy normalization is proposed, and it is shown to provide improvements in 
connected word recognition applications. The normalization consists of deter- 
mining a continuous peak energy contour for the speech, where the peak 
energy is determined over periods of time essentially corresponding to a 
syllable, and then modifying the actual energy contour with the peak energy 
contour so that absolute energy maxima occur about once per syllable. In this 
manner, the dynamically normalized, temporal energy contour of the word 
string effectively provides a set of temporal markers of high-energy events 
(content words) that aid the recognition of connected word sequences. 


I. INTRODUCTION 


The effectiveness of supplementing standard spectral features with 
an energy measurement (suitably normalized) for isolated word recognition 
applications has recently been demonstrated by several researchers.'* 
The basic idea of these schemes is to define an enhanced 
feature set (for each frame of speech within the word to be recognized) 
consisting of a pth-order Linear Predictive Coding (LPC) vector, a, 
concatenated with a normalized frame log energy, Er, where the 
normalization is with respect to the peak energy within the entire 
word. In this manner, the frame energy value is relative to the peak 
energy within the word, and is therefore relatively insensitive to gain 
variations in transmission and/or recording. 

For connected word recognition applications, the concept of how to 
provide proper energy normalization across a sentence-length utter- 
ance is one that is potentially open to a great deal of controversy. 
There is no exactly correct mechanism for handling the energy varia- 
tions that occur naturally when words are strung together and spoken 
at various rates of articulation. However, it seems reasonable, and 
intuitively appealing, that some type of syllabic rate normalization 
should be able to highlight and identify a large fraction of the words 
(especially so-called content words) in a spoken sentence. In this 
manner, the increase and decrease in the overall energy level would 
be naturally compensated by the Automatic Gain Control (AGC) 
action of the normalization scheme. 

The major obstacle to implementing a syllabic rate, energy normal- 
ization procedure for use with connected strings is that it is almost 
impossible to design such an algorithm unless the rate of articulation 
is known. Unfortunately, for most practical situations, we do not know 
the rate of articulation of the speech; hence we are forced to choose a 
set of implementation parameters that represent a compromise over 
those that are optimum for the particular spoken string, and those 
that are optimum for a wide class of talkers, strings, etc. The design 
and implementation of the syllabic rate, energy normalization proce- 
dure is discussed in Section II. In Section III we present results of an 
experimental evaluation of the energy normalization scheme on both 
connected digit strings and on sets of airlines words for use in the 
AT&T Bell Laboratories airlines information and reservation sys- 
tem.*” Finally, in Section IV we discuss the results and their implica- 
tions for further research. 


II. ENERGY NORMALIZATION FOR CONNECTED WORD STRINGS 


We define the log energy contour, E(m), of the connected word 
string as 


$$E(m) = 10 \log_{10}[V_m(0)], \qquad m = 1, 2, \ldots, M, \tag{1}$$


where $V_m(0)$ is the zeroth-order autocorrelation of the speech, i.e., 

$$V_m(0) = \sum_{n=0}^{N-1} \left( s[n + (m-1)L] \right)^2, \tag{2}$$


where M, L, and N are the number of frames in the string, the number 
of samples shifted between frames, and the frame size, respectively, 
and where s(n) is the speech signal. Typically, for telephone record- 
ings, we use a sampling rate of 6.67 kHz on the speech, and use values 
of N = 300 samples (45-ms frames), and L = 100 samples (15-ms 
shift). 
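
As a minimal sketch of eqs. (1) and (2), the function below computes the log energy contour of a sampled signal with the frame size and shift quoted above. The function name, the zero-based frame index, and the small floor that guards against taking the logarithm of zero on silent frames are assumptions added for illustration; they are not part of the original description.

```python
import math


def log_energy_contour(signal, frame_size=300, frame_shift=100, floor=1e-10):
    """Return E(m) in dB for each full frame of `signal`.

    frame_size is N and frame_shift is L in eq. (2). Frames are indexed from
    zero here, so list entry m corresponds to frame m + 1 in the text.
    """
    if len(signal) < frame_size:
        return []
    n_frames = 1 + (len(signal) - frame_size) // frame_shift
    contour = []
    for m in range(n_frames):
        start = m * frame_shift
        # Zeroth-order autocorrelation: sum of squared samples in the frame
        v0 = sum(x * x for x in signal[start:start + frame_size])
        # The floor is an assumption that avoids log10(0) on all-zero frames
        contour.append(10.0 * math.log10(max(v0, floor)))
    return contour


if __name__ == "__main__":
    # Synthetic 500-sample signal of constant amplitude 1.0
    print(log_energy_contour([1.0] * 500))
```

At a 6.67-kHz sampling rate, 300 samples span about 45 ms and 100 samples span about 15 ms, which matches the frame length and shift given in the text.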

For isolated word sequences, the normalization of the log energy 
contour is straightforward, and consists of locating the peak log energy 
across the word, $E_{\max}$, as 

$$E_{\max} = \max_{1 \le m \le M} [E(m)] \tag{3}$$

and normalizing the energy contour by subtracting $E_{\max}$ from each 
frame, i.e., 

$$\hat{E}(m) = E(m) - E_{\max}. \tag{4}$$


In this manner the log energy values are constrained to have a peak 
value of 0 dB, and the stressed vowels for a word are essentially 
guaranteed to have log energy values close to 0 dB. 
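
The isolated-word normalization amounts to a single subtraction, sketched below with a hypothetical function name and made-up contour values.

```python
def peak_normalize(contour):
    """Subtract the peak log energy so that the maximum of the contour is 0 dB."""
    e_max = max(contour)
    return [e - e_max for e in contour]


if __name__ == "__main__":
    # Hypothetical three-frame contour in dB
    print(peak_normalize([10.0, 25.0, 18.0]))
```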

Based on the above normalization procedure, reference- and test- 
word energy contours can be compared using a simple nonlinear, 
energy distance metric, which is then added to the standard LPC- 
shape distance to give an overall distance between test and reference 
frames. 

For connected word strings a more sophisticated energy normaliza- 
tion scheme is required. The idea of the normalization is to make the 
local energy maximum for each content word in the string as close to 
0 dB as possible. By content words we mean words with distinct 
stressed vowels (i.e., all digits in strings), as opposed to function words 
(e.g., “to”, “and”, “the”, “a”) in which there is often no stressed vowel 
in connected speech. Basically, what is required for performing such a 
normalization is a syllable detector. Although several approaches to 
syllable detection have been described in the literature,~!° we chose 
to implement a simple, signal processing approach to normalization, 
which is felt to be more appropriate to the problem at hand than other 
alternatives. 

A block diagram of the log-energy-normalization algorithm for connected 
word strings is given in Fig. 1. The log energy contour, E(m), 
m = 1, 2, ..., M, of the speech signal, s(n), is first computed according 
to eq. (1). A “syllabic rate” energy envelope, V(m), is computed as 

$$V(m) = \max_{\max[1,\, m - \mathrm{NW}/2] \,\le\, q \,\le\, \min[M,\, m + \mathrm{NW}/2]} [E(q)], \tag{5}$$

where the parameter NW is the number of frames over which the 
energy envelope maximum is computed. (We have considered values 
of NW from 15 to 35, i.e., five to two syllables per second.) 

The syllabic rate, energy envelope contour, V(m), is next smoothed 
by a median smoother” with a smoother duration of NM frames, 
where NM is typically chosen to be about half the size of NW, i.e., 10 
to 20 frames. The median smoother eliminates “sharp” dips in the 
syllabic rate, energy envelope contour between syllables. 

The final step in the process is to modify the log energy contour, 
E(m), by the median-smoothed, syllabic rate energy envelope, $\hat{V}(m)$, 
to give 


$$\hat{E}(m) = E(m) - \hat{V}(m), \tag{6}$$


which is the final, normalized, log energy envelope. 
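
The three steps just described (running maximum, median smoothing, subtraction) can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used for the experiments: the windows are taken to be centered on the current frame, they are simply truncated at the ends of the string, and all function names and default parameter values are hypothetical.

```python
from statistics import median


def peak_envelope(contour, nw):
    """Running maximum of the log energy over roughly nw frames around each frame."""
    half = nw // 2
    n = len(contour)
    # Windows are truncated at the string boundaries (an assumption)
    return [max(contour[max(0, m - half):min(n, m + half + 1)]) for m in range(n)]


def median_smooth(values, nm):
    """Running median over roughly nm frames, with truncated edge windows."""
    half = nm // 2
    n = len(values)
    return [median(values[max(0, m - half):min(n, m + half + 1)]) for m in range(n)]


def dynamic_normalize(contour, nw=25, nm=13):
    """Subtract the median-smoothed peak envelope from the log energy contour."""
    smoothed = median_smooth(peak_envelope(contour, nw), nm)
    return [e - v for e, v in zip(contour, smoothed)]


if __name__ == "__main__":
    # Synthetic contour with two "syllables" peaking at 20 dB and 12 dB
    contour = [max(0.0, 20.0 - 2.0 * abs(i - 10), 12.0 - 2.0 * abs(i - 40))
               for i in range(60)]
    normalized = dynamic_normalize(contour, nw=11, nm=5)
    print(normalized[10], normalized[40])
```

In the synthetic example both peaks are brought to 0 dB, whereas a single global peak normalization would leave the weaker peak 8 dB down. With a 15-ms frame shift, NW = 15 spans 225 ms and NW = 35 spans 525 ms, which is consistent with the range of roughly five to two syllables per second quoted above.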

Figures 2 through 5 illustrate the algorithm for four sets of word 
strings. In each of these figures, the upper plot shows E(m) (normalized 
so that its global peak across the string is set to 0 dB), and $\hat{E}(m)$ 
(dashed line) superimposed; the lower plot shows V(m) and $\hat{V}(m)$ 
(most of the time they are identical). 

Figure 2 shows results for the connected digit string /54110/ spoken 
at a fairly deliberate rate (2-1/2 digits per second or 150 digits per 
minute). Figure 2b shows that the energy envelope exhibits approxi- 
mately a 7.6-dB variation from the first digit peak to the fourth digit 
peak. After peak energy normalization, each of the five digits in the 
string is clearly marked and each attains a 0-dB energy peak during 
the stressed vowel. 

The example of Fig. 3 is for the digit string /5820/ spoken fairly 
rapidly (175 digits per minute). For this case the digit 2 is not properly 
normalized, since the median smoother misses the energy envelope by 
about 2 dB. However, each of the four digits in the string is more 
distinct in the normalized energy contour than in the original energy 
contour. 

The example of Fig. 4 is for the sentence, “I want to make a 


[Block diagram: log energy, energy envelope, median smoother, energy normalization]


Fig. 1—Block diagram of dynamic energy normalization scheme for connected word 
strings. 








1984 TECHNICAL JOURNAL, NOVEMBER 1984 





[Figure: energy in decibels versus frame number]


Fig. 2—(a) Log energy contours (original plus normalized) and (b) peak energy 
envelope contours (original plus median smoothed) for the digit string /54110/. 


reservation”, spoken at a rate of 221 words per minute. The energy 
normalization does a good job for the content words “I”, “want”, and 
“make”, but is not able to handle the brief, unstressed words “to” and 
“a”, and actually provides a double normalization for the word “reser- 
vation”, because of the presence of two stressed vowels in the four- 
syllable word. The inability of the algorithm to handle the very short 
function words in continuous speech is inherently unalterable, and the 
recognition algorithm, which ultimately uses the normalized energy 
contour, must still work reliably in the face of this type of shortcoming. 
Similarly, the detection of multiple stressed vowels within a single 
polysyllabic word is a natural result of the detection process, and must 
be properly handled by the recognizer. We will discuss these points 
further in Section III. 

The final example of this group, Fig. 5, shows results for the 12- 
word sentence, “I would like to return on Wednesday afternoon the 
one three October”, spoken at a rate of 172 words per minute. For this 
sentence a large range of energy values for the individual words is 
exhibited (i.e., 15.3 dB on the lower plot), and even this large a range 
is not quite enough to handle each of the content words in the sentence. 
The only word that was not properly normalized was “would”, which 
was highly reduced. The words “Wednesday” and “October” both had 
Fig. 3—Log energy contours (original plus normalized) and peak energy envelope 
contours (original plus median smoothed) for the digit string /5820/. 


two stressed vowels and hence were normalized to 0 dB at two places 
within the word. 


III. EXPERIMENTAL EVALUATION 


To evaluate the effectiveness of the energy normalization algorithm 
for connected word strings, a series of three experiments was run. 
For the first experiment, we performed a recognition test on 1520 
connected digit strings from 19 talkers. All recordings were made over 
local dialed-up telephone lines and all recognition tests were run using 
the level-building, Dynamic Time Warping (DTW) algorithm” in a 
speaker-independent mode using word templates extracted from a 
different set of talkers.’*"1*4 Details of the way in which the reference 
set was extracted are given in Ref. 14. 

The second recognition experiment used a vocabulary of 129 airlines 
terms and a deterministic language model (i.e., a grammar) to specify 
allowable sentences in the language. For this experiment, a syntax- 
directed, level-building, DTW algorithm’ was used as the recognizer. 
There were six test talkers, each of whom spoke a balanced set of 51 
sentences from the language. (The set was balanced in terms of usage 
of words in the vocabulary and in terms of covering all major paths in 
Fig. 4—Log energy contours (original plus normalized) and peak energy envelope 
contours (original plus median smoothed) for the sentence “I want to make a reserva- 
tion”. Each word in the sentence is demarked (approximately) by vertical dashed lines. 


the grammar.) The list of 51 sentences used in this experiment is given 
in Table I. A total of 438 words occurred in the 51 sentences; hence 
the average sentence duration was somewhat over eight words. Four 
of the six test talkers provided a set of isolated-word training patterns 
for the 129-word vocabulary using the robust training procedure of 
Rabiner and Wilpon.’ For these four talkers we ran both speaker- 
dependent and speaker-independent recognition tests; for the other 
two talkers only speaker-independent recognition tests were run. The 
speaker-independent runs used a speaker-independent, isolated-word 
reference set obtained by means of a clustering analysis of the word 
tokens of 100 different talkers (50 male, 50 female).?” 

The third recognition experiment again used the 129-word airlines 
vocabulary, but substituted a level-building, Hidden Markov Model 
(HMM) for the DTW recognizer.'® Single-word HMMs were designed 
for each of the 129 words in the vocabulary, based on the same training 
set from which the speaker-independent word templates were created. 
(No speaker-dependent models were used in this experiment.) Word 
models were concatenated, according to the language model (the 
deterministic grammar) using the level-building concept to link ends 
of one model to the beginnings of the next model. The individual word 
Fig. 5—Log energy contours (original plus normalized) and peak energy envelope 
contours (original plus median smoothed) for the sentence “I would like to return on 
Wednesday afternoon the one three October”. Each word in the sentence is demarked 
(approximately) by vertical dashed lines. 


models each had ten states, and used an energy-based Vector Quantizer 
(VQ)? with 128 code-book entries. 


3.1 Results of experiment 1—connected digits 


The results of the connected digits runs are given in Table II and 
Fig. 6. The 1520 strings were divided into two groups of 760 strings 
each; the first group was spoken at deliberate rates (about 135 digits 
per minute), whereas the second group was spoken at normal rates 
(about 170 digits per minute). String error rates were measured for 
the top β candidates (i.e., the probability that the correct string was 
not in the β best strings) with string length unknown, and for the top 
candidate for known string lengths. A value of β of five was used for 
these tests. 
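
The top-candidate error measure described above can be sketched as follows. The function name and the toy candidate lists are hypothetical and serve only to illustrate the definition: a string counts as an error at a given candidate position if the correct string is not among that many best candidates.

```python
def string_error_rate(ranked_candidates, truths, beta):
    """Percent of test strings whose correct transcription is not among the
    first `beta` entries of the recognizer's ranked candidate list."""
    errors = sum(1 for candidates, truth in zip(ranked_candidates, truths)
                 if truth not in candidates[:beta])
    return 100.0 * errors / len(truths)


if __name__ == "__main__":
    # Hypothetical recognizer output for four digit strings, best candidate first
    ranked = [
        ["54110", "54119", "5410"],
        ["5829", "5820", "582"],
        ["111", "117", "177"],
        ["90", "98", "99"],
    ]
    truths = ["54110", "5820", "171", "90"]
    for beta in (1, 2, 3):
        print(beta, string_error_rate(ranked, truths, beta))
```

By construction the error rate cannot increase as the candidate position grows, which is the kind of dependence plotted in Fig. 6.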

Table II shows the β = 1 results for a recognizer without energy 
(i.e., using LPC vectors alone); a recognizer using energy, where only 
a global peak normalization (similar to the algorithm for isolated 
words) is used; a recognizer with energy, using the dynamic normalization 
procedure of Section II; a recognizer with a shape VQ with 128 
code-book entries; and a recognizer with an energy VQ with 128 entries 
Table I—Sentences used to evaluate the airline recognition system 


1 I want to make a reservation. 
2 I would like some information please. 
3 I want to go from New York to Los Angeles on Tuesday morning. 
4 I would like to return on Wednesday afternoon the one three October. 
5 I would like a nonstop flight. 
6 When do flights leave Philadelphia for Detroit on Monday afternoon? 
7 I want to go at twelve o’clock. 
8 I would like to depart at night. 
9 I want to leave in the morning. 
10 I want to depart from Boston on the evening of the oh nine November. 
11 How many flights are there from Washington to Denver on Thursday night? 
12 How many flights go from Seattle to Miami on the two eight February? 
13 What plane is on flight two six to Chicago? 
14 How many stops are there on the flight? 
15 I would like flight number four one. 
16 I will take flight five three. 
17 I would like a first class seat. 
18 I need three seats. 
19 I want one coach seat. 
20 What is the flight time from Boston to Chicago? 
21 Is a meal served on the flight to Denver? 
22 How much is the fare? 
23 What is the fare from Detroit to Philadelphia on Sunday night? 
24 When does flight number two from Los Angeles arrive? 
25 At what time does flight seven one to Seattle depart? 
26 My home phone number is area code two oh one six two four one two four six. 
27 My office phone number is five three six two one five two. 
28 Please repeat the arrival times. 
29 Please repeat the departure time. 
30 I will pay by credit card. 
31 I prefer the Lockheed ten eleven. 
32 I prefer the Boeing seven four seven. 
33 I prefer the D.C. nine. 
34 I prefer the Douglas D.C. ten. 
35 I prefer the B.A.C. ten. 
36 I will pay by Master Charge. 
37 I will pay by cash. 
38 I will pay by Diners Club. 
39 I will pay by American Express. 
40 I want to go at eleven a.m. 
41 I want to go at six p.m. 
42 I want to return to Chicago on the three oh December. 
43 I would like to depart on Friday evening. 
44 I would like one first class seat on flight number four four to Los Angeles. 
45 I want to return on the oh nine March. 
46 I want to go to Washington on the two four April. 
47 I would like to return to New York on the oh one May. 
48 I want to leave for Los Angeles on the morning of the one four June. 
49 I want to go from Boston to Philadelphia on Tuesday morning the oh four July. 
50 I would like to return on the oh seven August. 
51 At what time do flights leave Boston for Denver on the two seven September? 


and dynamic energy normalization. Figure 6 shows the string error 
rate, as a function of β, for the five recognizers described above. Based 
on the results of Table II and Fig. 6, the following observations can 
be made: 

1. For connected digit strings, there is essentially no advantage to 
using energy in addition to LPC shape. The only case in which energy 
provided a significant performance improvement was for deliberately 
Fig. 6—String error rates as a function of candidate position for (a) deliberate strings 
and (b) normal rate strings for five recognition conditions. 

Table II—String error rates in percent for connected digit strings 

                         Deliberate Strings       Normal Rate Strings
                         Length      Length       Length      Length
Condition                Unknown     Known        Unknown     Known

No energy                12.4        4.9          7.4         5.3
Energy-peak norm         16.3        14.3         21.3        19.2
Energy-dynamic norm      8.8         5.9          9.2         7.2
No energy-shape VQ       15.9        9.2          13.2        11.1
Energy VQ                12.4        8.6          15.3        12.8


spoken digit strings whose length was unknown. For all other cases 
there was a small loss in performance when energy was incorporated 
into the recognizer. 

2. Improper normalization of the energy contour leads to significant 
degradation in performance on connected digit strings. This result 
shows that the dynamic normalization procedure is indeed providing 
a better model for the energy contours of individual words than those 
obtained from just using the original energy contour of the utterance. 

3. The small performance degradation for normal rate strings of 
unknown length is essentially only for the top recognition candidate. 
As seen in Fig. 6, for candidate positions 2 through 5 the performance 
with dynamic normalization of energy is indeed slightly better than 
without energy. 


3.2 Results of experiment 2—airlines sentences using DTW 


The results of the recognition runs using the airlines vocabulary 
and grammar, and using the DTW level-building recognizer are given 
in Table IIIa. This table shows average string and word error rates for 
both the speaker-dependent and speaker-independent runs for two 
conditions, namely, the recognizer without energy (i.e., using only LPC 
in the distance) and the recognizer with the dynamic energy normali- 
zation. 

The results of Table IIIa show that in the speaker-dependent mode, 
the improvement in both sentence and word accuracy is dramatic (7.4 
percent and 1.7 percent, respectively). In the speaker-independent 
mode there is an improvement in performance of 1.1 percent in string 
error rate when using energy, but the word error rate is essentially the 
same for both conditions. Presumably this result is due to the diversity 
of patterns and energy contours in the 12-template-per-word reference 
set; hence the reliance on energy to provide marker points during the 
word string is considerably less than for the speaker-dependent runs. 


3.3 Results of experiment 3—airlines sentences using HMM 


The results of the recognition runs using the airlines vocabulary 
and grammar, and using the HMM level-building recognizer are given 

Table III—A comparison of string and word error rates for airline 
sentences using a DTW level-building algorithm and an HMM 
level-building algorithm 

(a) String and Word Error Rates in Percent for Airlines Sentences Using DTW 
Level-Building Algorithm 

                         Speaker Dependent            Speaker Independent
                         String Error   Word Error    String Error   Word Error
Condition                Rate           Rate          Rate           Rate

No Energy                20.6           5.5           26.9           7.4
Energy-Dynamic Norm      13.2           3.8           25.8           7.5

(b) Speaker-Independent String and Word Error Rates in Percent for Airlines 
Sentences Using an HMM Level-Building Algorithm 

                         String Error   Word Error
Condition                Rate           Rate

Energy-Peak Norm         34.0           8.6
Energy-Dynamic Norm      25.1           6.7


in Table IIIb. This table shows average string and word error rates for 
two conditions, namely, using energy with only global peak normali- 
zation, and using energy with dynamic normalization. (A partial run 
was made without energy, but the string error rates were on the order 
of 95 percent! Hence, for the HMM recognizer, the use of energy, in 
some form, is mandatory.) 

The results of Table IIIb again show a dramatic reduction in both 
string and word error rates when the dynamic energy normalization is 
used (i.e., 8.9 percent and 1.9 percent, respectively). Comparing the 
results to those given in Table IIIa it can be seen that the HMM 
level-building recognizer (which uses a 128-codeword VQ) actually 
outperforms a 12-template-per-word, DTW, level-building recognizer 
without VQ. 
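
The improvements quoted in Sections 3.2 and 3.3 are absolute differences in percentage points between the two conditions of each run. The short check below recomputes them from the Table III entries. The dictionary layout and function name are hypothetical, and the relative reductions are an added convenience that the text does not quote.

```python
# Error rates in percent, transcribed from Table III as (baseline, dynamic norm).
# The baseline is "No Energy" for the DTW runs and "Energy-Peak Norm" for HMM.
TABLE_III = {
    "DTW speaker-dependent": {"string": (20.6, 13.2), "word": (5.5, 3.8)},
    "DTW speaker-independent": {"string": (26.9, 25.8), "word": (7.4, 7.5)},
    "HMM speaker-independent": {"string": (34.0, 25.1), "word": (8.6, 6.7)},
}


def reductions(baseline, improved):
    """Return (absolute reduction in points, relative reduction in percent)."""
    absolute = round(baseline - improved, 1)
    relative = round(100.0 * (baseline - improved) / baseline, 1)
    return absolute, relative


if __name__ == "__main__":
    for run, rates in TABLE_III.items():
        for unit, (baseline, improved) in rates.items():
            print(run, unit, reductions(baseline, improved))
```

A negative value, as for the speaker-independent DTW word error rate, indicates a slight increase in error rather than a reduction.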


IV. DISCUSSION 


The results presented in this paper on the use of energy along with 
LPC for recognition of connected word strings indicate the following: 

1. Simple application of the peak energy normalization scheme 
appropriate for isolated words leads to poor performance for connected 
word systems. 

2. Improved performance can be obtained by using a dynamic energy 
normalization, which essentially adjusts the energy contour according 
to the local maximum over a time duration roughly corresponding to 
a syllable. 

3. For relatively simple vocabularies, such as the digits, the infor- 
mation contained in the energy contour is, at best, only marginally 
useful for improving recognizer performance. The condition under 
which it performs the best is in reducing digit insertions for rates of 
articulation that are fairly low. For normally spoken connected digit 
strings, there is actually a small degradation in performance when the 
energy contour is used. 

4. For more complex vocabularies, such as the set of airlines terms, 
the information contained in the energy contour can and does improve 
the performance of the recognizer on connected word strings; in some 
cases the improvements are quite dramatic. The reason for this im- 
provement in performance is that the energy contour, when properly 
normalized, essentially highlights the content words in the sentence 
and provides a boost to the alignment of words from the grammar. 

There are several issues concerning the implementation of the 
energy normalization that should be discussed here. First of all, it 
should be clear that this, and any other proposed energy normalization 
scheme, is essentially an ad hoc procedure for highlighting words in 
connected strings. There is no exactly correct method for performing 
the appropriate normalization; at best, we can hope that the proposed 
method has some desirable properties and performs well in some 
typical applications. 

A second point concerns the variable parameters, NW and NM, of 
the implementation of the energy normalization algorithm. We have 
experimented with values of 10 < NW < 35 and 10 < NM < 25, and 
have found that the performance results are relatively insensitive over 
a wide range of values of NW and NM. This is a highly desirable 
result in that a fixed set of values can be chosen and used in all 
circumstances. However, it should be clear that, in individual cases, 
when the rate of articulation is high (e.g., over 200 words per minute), 
values of NW and NM near the lower limits will give better perform- 
ance than those near the upper limits. Conversely, for strings articu- 
lated at low rates (near 100 to 130 words per minute), values of NW 
and NM near the upper limits will give the best recognition perform- 
ance. 

Finally, the issue arises as to how to handle polysyllabic words with 
more than one stressed vowel. For our runs we have made no attempt 
to do anything special for such cases, since the energy contours of the 
isolated word tokens, in these cases, naturally exhibit two strong 
(almost equal level) energy peaks. The result indicates no special 
problems with such polysyllabic words. We did do one check, in which
the isolated word reference energy patterns themselves were passed
through the dynamic energy normalization procedure and then used
in the DTW recognizer. The results were one-for-one identical with
those obtained without this reference energy correction procedure.
Hence we conclude that multistressed, polysyllabic words present no
real problems for the dynamic energy normalization algorithm.


V. SUMMARY


In this paper we have proposed one approach to dynamically nor- 
malizing the energy contour of a connected word string so that energy 
can be used along with LPC spectral shape in the recognition of 
connected word strings. We have shown the approach to be reasonable 
from the point of view of finding content words in the string and 
bringing their energy levels to be local peaks of essentially fixed level 
in the string. 

Recognition results indicate that energy is primarily useful for 
complex word vocabularies but is at best marginal for simple (mono- 
syllabic) word vocabularies such as the digits. In all cases we have 
shown that the proposed dynamic energy normalization outperforms 
the simple peak energy normalization procedure that was shown to be 
suitable for isolated word sequences. 


Spatial Filtering Radio Astronomical Data:
One-Dimensional Case 


By H. E. ROWE* 
(Manuscript received May 26, 1983) 


Radio astronomical measurements of radio brightness are made by pointing 
an antenna at a regular array of points in the sky and measuring the received 
noise power at each point. In the absence of receiver noise, the measured 
brightness is the convolution of the true brightness distribution with the 
antenna effective area (i.e., receiving power pattern), evaluated at the point of 
observation. Front-end noise in the radiometer receiver adds fluctuations 
inversely proportional to the observing time at each measured point. From 
such data, we calculate optimum mean-square estimates for two quantities: 
measured brightness between observations, and true brightness at and between 
observations. The first is interpolation; the second, called restoration, partially 
deconvolves the antenna pattern from the measured data. We determine the 
errors associated with each, as functions of: (1) receiving antenna pattern, (2) 
separation between observations, and (3) radiometer output signal-to-noise 
ratio. These results permit the construction of maps of measured and true 
brightness, with known mean-square errors. In this paper we study the one- 
dimensional version of this problem, assuming a large number of measured 
points. We find that measured points should be separated by about half the 
(full) 3-dB beamwidth for conventional antennas. Restoration is more costly 
than interpolation. 


I. INTRODUCTION


Radio astronomical measurements of radio brightness are made by 
pointing an antenna at a regular array of points in the sky and 
measuring the power of the random signal at the antenna output with
a radiometer receiver. The measured points may lie in a square array; 
alternatively, we may wish to consider a hexagonal array, or irregular 
arrays of measured points. 

We denote the antenna output power for each measured point as 
the measured brightness; the measured brightness is the convolution 
of the true brightness distribution with the antenna receiving power 
pattern, evaluated for the particular point under study. The radiometer 
receiver has a “signal” output proportional to the measured brightness, 
plus a “noise” output, due principally to the receiver front-end noise 
(which far exceeds the received power at the antenna output). The 
receiver “noise” output has mean-square value proportional to the 
receiver noise temperature squared and inversely proportional to the 
observation (or integration) time (see the Appendix of Ref. 1). 

We wish to construct maps of radio brightness from such discrete 
measurements. First, let us estimate the measured brightness between 
observations, i.e., what we would have observed if we had pointed the 
antenna in between the actual measured points. This has been called
“interpolation.”

However, the antenna does not have an infinitely narrow beam, and 
hence the measured brightness is a smoothed version of the true 
brightness. We wish to deconvolve the antenna pattern so as to 
determine the true brightness distribution as closely as possible. This
has been called “restoration.”

For a given antenna, two factors that limit accuracy for either 
interpolation or restoration are the separation between measurements 
and receiver noise. 

Suppose first that receiver noise is absent. Measurements that are 
too widely separated will provide little information about the bright- 
ness at points far from the observations. However, measurements that 
are too close yield redundant information. Receiver noise limits the 
accuracy for both interpolation and restoration; however, restoration 
is much more severely affected since it involves deconvolution of the 
antenna pattern. 

Receiver noise at each point is inversely proportional to the obser- 
vation time at that point, as noted above. Therefore, if we allot a given 
total amount of time for a particular region of the sky, we can trade 
off separation between observations against signal-to-noise ratio. That 
is, we may choose few widely spaced observations, with long observing 
times and hence small noise, or alternatively many closely spaced 
observations with short observing times and hence large noise. 
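
The trade-off just described can be put in numbers with a small sketch. The function name, the `span` argument, and `noise_constant` (a stand-in for the receiver-dependent factor) are assumptions introduced here; only the inverse dependence of noise power on observing time comes from the text.

```python
def per_point_tradeoff(total_time, span, n_points, noise_constant=1.0):
    """Return (separation, noise power per point) for n_points equally
    spaced observations that share a fixed total observing time.

    Assumes noise power = noise_constant / (time per point).
    """
    time_per_point = total_time / n_points
    separation = span / n_points
    noise_power = noise_constant / time_per_point
    return separation, noise_power

coarse = per_point_tradeoff(total_time=100.0, span=10.0, n_points=10)
fine = per_point_tradeoff(total_time=100.0, span=10.0, n_points=40)
```

Quadrupling the number of points quarters the separation and quadruples the noise power at each point.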

We require the optimum mean-square estimates for interpolation 
and restoration. These results will yield the best angular separation 
between observations, as well as the relative costs of restoration and 
interpolation. The optimum estimates depend on the statistical model
assumed for the brightness distribution; we assume the brightness 
varies rapidly compared to the antenna beamwidth. 

This paper explores the one-dimensional version of this problem, 
an antenna with a strip aperture and a fan beam. We assume an 
infinite number of equally spaced observed points. The mean-square 
interpolation and restoration errors are determined as functions of the 
separation between observations, antenna width, and signal-to-noise 
ratio for two (one-dimensional) antenna illumination functions: 

1. Uniform illumination (maximum gain antenna) 

2. Truncated Gaussian illumination with a 15-dB taper.

For this one-dimensional problem we find that observations should 
be separated by about half the (full) 3-dB beamwidth in the case of
truncated Gaussian illumination, which corresponds to a normal an- 
tenna. For a given error, restoration is much more expensive than 
interpolation in observing time. 

The present results serve as a guide for similar studies of the real 
two-dimensional case, i.e., an antenna with a circular aperture and a 
pencil beam. 


II. SAMPLED-DATA MODEL FOR MEASUREMENT OF A ONE-
DIMENSIONAL INCOHERENT FIELD BY A FAN-BEAM ANTENNA 


Consider a strip antenna with its aperture located in the x-y plane, 
as we see in Fig. 1. The aperture has width W along the x axis, and is 
centered about the y-z plane; the aperture extends to $\pm\infty$ along the y
axis. All field quantities are assumed to be independent of y; therefore, 
the antenna has a fan beam, with gain and effective width (as trans- 
mitting and receiving antenna, respectively) that depend on only the 
angular coordinate t, measured in the x-z plane, of a cylindrical 
coordinate system shown in Fig. 1. The angle t is not measured in 
radians, but rather is suitably normalized to simplify the following 
relations; the details of this normalization are unnecessary for our 
present purposes. Suppose for the present that the antenna beam 
points along the z axis; denote the effective width by A(t), with Fourier
transform $\mathcal{A}(f)$.

Let the aperture electric field (in the x-y plane) be polarized along
the x direction, and denoted by E(x). Then for narrow-beam antennas
$\mathcal{A}(f) \propto E(f) \circledast E^*(-f)$, where we use the symbol $\circledast$ to denote
convolution throughout. Since E(x) is zero outside of the aperture,
i.e., for |x| > W/2, it follows that $\mathcal{A}(f)$, the Fourier transform of the
effective width A(t), is strictly bandlimited to |f| < W.

We measure a one-dimensional, incoherent field, with radio brightness
x(t), by pointing this antenna in direction t, denoting the power
out of the antenna feed as $x_o(t)$, which we call the measured brightness.
The measured brightness is the convolution of x(t) with A(t):
$x_o(t) = x(t) \circledast A(t)$. Such measurements are repeated at equally
spaced angles t = kT, k = ..., -1, 0, 1, ..., providing measured
brightness samples $x_o(kT)$. Independent noise is added by the receiver
to these samples; and from these noisy samples we wish the optimum
linear estimate of the brightness x(t) (restoration), or of the measured
brightness $x_o(t)$ (interpolation). Figure 2a shows these measurements.


Fig. 1—Geometry of strip antenna.

The symbols in Fig. 2 have the following definitions for the one-
dimensional antenna problem:


t                   normalized angular coordinate.
f                   normalized spatial frequency.
x(t)                radio brightness (one-dimensional).
W                   antenna width.
E(x)                aperture electric field; zero for |x| > W/2.
A(t)                effective width (as a receiving antenna).
$\mathcal{A}(f)$    Fourier transform of A(t); approximately proportional to
                    $E(f) \circledast E^*(-f)$ in narrow-beam approximation, strictly
                    bandlimited to |f| < W.
kT                  observing angles.
$x_o(kT)$           receiver signal samples.
n(t)                random function; n(kT) = receiver noise samples,
                    independent for different k.
N                   $= \langle n^2(kT) \rangle$, expected sample noise power; proportional to
                    receiver noise temperature squared and inversely
                    proportional to integration time.
h(t)                weight function for data samples $x_o(kT) + n(kT)$.
H(f)                Fourier transform of h(t); transfer function used as spatial
                    filter on data samples.


[Fig. 2 block diagram; filter output $y(t) = x_r(t) + n_r(t)$.]

Fig. 2—Sampling and reconstruction of a stationary random function. (a) Wideband
input—general input and reconstruction filters. (b) Bandlimited version of input.


The effective width A(t) and its Fourier transform $\mathcal{A}(f)$ satisfy the
following relations in the narrow-beam approximation, with suitable
normalization of the angular coordinate t:

$$0 \le A(t) \le W$$

$$\mathcal{A}(f) = \mathcal{A}^*(-f) = \frac{\displaystyle\int_{\max(-W/2,\,-W/2+f)}^{\min(W/2,\,W/2+f)} E(x)E^*(x-f)\,dx}{\displaystyle\int_{-W/2}^{W/2} |E(x)|^2\,dx}$$

$$\mathcal{A}(0) = \int_{-\infty}^{\infty} A(t)\,dt = 1; \qquad \mathcal{A}(f) = 0, \quad |f| \ge W. \tag{1}$$

We do not consider super-gain antennas in the narrow-beam approximation.
Maximum effective width, A(0) = W, is attained for uniform
illumination of the antenna aperture with zero phase error, E(x) = 1
for |x| < W/2, i.e., for an antenna with maximum on-axis gain. For
this case $\mathcal{A}(f)$ is the triangular function illustrated in Fig. 2a:

$$\mathcal{A}(f) = \begin{cases} 1 - |f/W|, & |f| \le W \\ 0, & |f| \ge W. \end{cases} \tag{2}$$
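
As a check on the triangular form, the normalized aperture autocorrelation can be worked out by hand for uniform illumination. The sketch below assumes the autocorrelation form of (1), with E(x) = 1 on |x| < W/2 and 0 ≤ f ≤ W; negative f follows by symmetry.

```latex
\begin{aligned}
\int_{f - W/2}^{W/2} E(x)E^*(x-f)\,dx &= W - f, \qquad
\int_{-W/2}^{W/2} |E(x)|^2\,dx = W, \\
\mathcal{A}(f) &= \frac{W - f}{W} = 1 - \frac{f}{W}, \\
A(0) &= \int_{-W}^{W} \mathcal{A}(f)\,df
      = 2\int_0^W \Bigl(1 - \frac{f}{W}\Bigr)\,df = W .
\end{aligned}
```

The last line reproduces the maximum effective width A(0) = W quoted for the maximum-gain antenna.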


Alternatively, the block diagram of Fig. 2a may be considered to
represent a sampled-data system. An input time function x(t) is
convolved with A(t) or equivalently filtered by its Fourier transform
$\mathcal{A}(f)$, producing $x_o(t)$. Noise n(t) is added to $x_o(t)$, and their sum is
sampled to produce noisy samples $x_o(kT) + n(kT)$, with different noise
samples independent. The noisy samples are filtered by H(f) to
produce optimum linear estimates of x(t) or of $x_o(t)$. Much of the
following discussion will be carried out for this sampled-data system,
which is equivalent to the measurement of a one-dimensional incoherent
field by a one-dimensional antenna.

It remains to specify the spectra of the input signal x(t) and the
noise n(t) in Fig. 2a. We have assumed the radio brightness x(t) varies
rapidly with respect to the antenna beamwidth, i.e., with respect to
the width of A(t); this would arise, for example, from a random
distribution of point sources, dense compared to the antenna beamwidth.
Since power is positive, brightness is also positive, and consequently,
$\langle x(t) \rangle > 0$. We assume that the dc component of x(t) is
deterministic (see Appendix A), and we treat it separately. The ac
component of x(t) has power spectrum $P_x(f)$ essentially white, i.e.,
wide compared to the bandwidth W of $\mathcal{A}(f)$; we denote the spectral
density of $P_x(f)$ within this band |f| < W by X.

Finally, the noise spectrum $P_n(f)$ of Fig. 2a (white, bandlimited,
with spectral density NT in the band |f| < 0.5/T) will yield independent
noise samples n(kT) with power N, as we assumed above.

Consider the block diagram of Fig. 2a as a sampled-data system.
The stationary input x(t) is filtered by $\mathcal{A}(f)$, which is bandlimited to
W as indicated, producing the quantity $x_o(t)$. Since $P_x(f)$, the power
spectrum of x(t), is white with spectral density X within this band,
$x_o(t)$ at the output of $\mathcal{A}(f)$ will have power spectrum

$$P_{x_o}(f) = |\mathcal{A}(f)|^2 P_x(f) = X\,|\mathcal{A}(f)|^2. \tag{3}$$

Noise is added, and the noisy filtered output is sampled at interval T;
the noise samples are independent, with power N. Finally, a reconstruction
filter H(f) yields an output signal $x_r(t)$ and noise $n_r(t)$.

All frequency components of the original input x(t) outside the band
W, |f| > W, have been lost. Therefore, we can only estimate the
bandlimited version of x(t) (Fig. 2b), i.e.:

$$x_W(t) = x(t) \circledast 2W\,\frac{\sin 2\pi W t}{2\pi W t}. \tag{4}$$


Alternatively, we might wish to estimate $x_o(t)$, at the output of the
filter $\mathcal{A}(f)$ of Fig. 2a. Consequently, we define the following two
errors:

$$e_W(t) = x_r(t) - x_W(t)$$

$$e_o(t) = x_r(t) - x_o(t). \tag{5}$$

The quantities e(t), which represent the errors in the absence of noise,
consist of linear distortion plus aliasing. Noting that x(t) and n(t),
and hence $x_r(t)$ and $n_r(t)$, are independent, the output error e(t) is
independent of the output noise $n_r(t)$. Consequently, the total mean-
square deviation between desired and actual output is given in each of
the two cases as follows:

$$\langle \overline{d_W^2(t)} \rangle = \langle \overline{[y(t) - x_W(t)]^2} \rangle = \langle \overline{e_W^2(t)} \rangle + \langle \overline{n_r^2(t)} \rangle$$

$$\langle \overline{d_o^2(t)} \rangle = \langle \overline{[y(t) - x_o(t)]^2} \rangle = \langle \overline{e_o^2(t)} \rangle + \langle \overline{n_r^2(t)} \rangle. \tag{6}$$

Here the angle brackets $\langle\ \rangle$ indicate an ensemble average and the
overbar indicates a time average over one sampling interval T, in the
sampled-data model of Fig. 2. The quantities in (6) will depend on
H(f), the transfer function of the reconstruction filter. We might want
to minimize either of them; the transfer functions that do so are
denoted $H_W(f)$ and $H_o(f)$, respectively. By analogy to the terminology
used for the two-dimensional antenna problem in Section I, we call
the estimation of $x_o(t)$ “interpolation”, and the estimation of $x_W(t)$
“restoration” in the present one-dimensional problem.
We distinguish two cases:


WT < 0.5; oversampled
WT > 0.5; undersampled.

In the undersampled case the $\langle \overline{e^2(t)} \rangle$ of (6) comprise both linear
distortion and aliasing; in the oversampled case aliasing is absent. The
$\langle \overline{n_r^2(t)} \rangle$ of (6) arise from noise in both cases.

The oversampled case is the simplest. Here aliasing is absent, as we 
noted above. Both the linear distortion and, as we see below, the 
output noise are stationary. Consequently, we may drop the time- 
averaging symbols throughout (6). 

In the undersampled case both the aliasing and the output noise are 
nonstationary. While the time-averaged quantities in (6) are easy to 
compute, we need in addition these quantities as functions of time, 
i.e., with the time-averaging symbols in (6) removed. 

The optimum filters $H_W(f)$ and $H_o(f)$, which minimize the mean-
square deviations in (6), are Wiener least-square-error filters. We
show for the undersampled case that Wiener filters minimize not only
the time-averaged mean-square deviations, but also the time-dependent
mean-square deviations. The spectra of linear distortion, aliasing,
and noise are all simply additive.

Such filters operate on all the data samples $x_o(kT) + n(kT)$,
$-\infty < k < \infty$. In practice we will estimate $x_W(t)$ or $x_o(t)$ from a
finite number of data samples. Here it is no longer possible to separate
the contributions of linear distortion and aliasing; the errors are
nonstationary in both the undersampled and oversampled cases. We defer
treatment of this problem to a future paper.

We treat these various cases below as an introduction to the two- 
dimensional antenna case. We take the following quantities as given: 


X                       $= \lim_{f \to 0} P_x(f)$, $f \ne 0$, low-frequency limit of continuous
                        component of spectral density of x(t).
$\langle x(t) \rangle$  expected value of x(t).
T                       sampling interval.
N                       mean-square sample noise, independent for different samples.
$\mathcal{A}(f)$        input filter transfer function: $\mathcal{A}(0) = 1$; $\mathcal{A}(f) = 0$, $|f| \ge W$.


We assume

$$\lim_{\epsilon \to 0} \int_{-\epsilon}^{\epsilon} P_x(f)\,df = \langle x(t) \rangle^2. \tag{7}$$


III. GENERAL FILTERS


The signal and noise outputs in Fig. 2a are given in terms of the
measured sample values as follows:

$$x_r(t) = T \sum_{k=-\infty}^{\infty} x_o(kT)\,h(t - kT)$$

$$n_r(t) = T \sum_{k=-\infty}^{\infty} n(kT)\,h(t - kT). \tag{8}$$

The weight function, h(t), is the impulse response of the reconstruction
filter of Fig. 2, i.e., the Fourier transform of H(f). The output y(t)
may be intended as an estimate either of $x_W(t)$ or of $x_o(t)$ of Fig. 2,
with errors $e_W(t)$ or $e_o(t)$, respectively, (5), and noise $n_r(t)$.
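
The reconstruction sum can be sketched numerically as follows. This is an illustration under stated assumptions, not the optimum filter of the paper: the weight function is taken to be the ideal low-pass response h(t) = sinc(t/T)/T (that is, H(f) = 1 for |f| < 0.5/T), the infinite sum is truncated to a finite range of k, and the function and variable names are hypothetical.

```python
import numpy as np

def reconstruct(samples, k, T, h, t):
    """Weighted sum of samples: x_r(t) = T * sum_k samples[k] * h(t - k*T)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    weights = h(t[:, None] - k[None, :] * T)        # shape (len(t), len(k))
    return T * np.sum(samples[None, :] * weights, axis=1)

T = 1.0
k = np.arange(-2000, 2001)                          # truncated range of sample indices
f0 = 0.2                                            # test frequency, below 0.5 / T
samples = np.cos(2 * np.pi * f0 * k * T)            # noise-free samples

def h_ideal(t):
    # Assumed filter: impulse response of H(f) = 1 for |f| < 0.5/T.
    return np.sinc(t / T) / T                       # np.sinc(x) = sin(pi x) / (pi x)

x_at_sample = reconstruct(samples, k, T, h_ideal, 3 * T)[0]
x_between = reconstruct(samples, k, T, h_ideal, 0.37 * T)[0]
```

At a sample point the sum returns the sample itself; between samples it interpolates the bandlimited input, up to the truncation error of the finite sum.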
The power spectra of these errors are given, respectively, as follows:

$$P_{e_W}(f) = |H(f)|^2 \sum_{n \ne 0} P_{x_o}\!\left(f - \frac{n}{T}\right) + |1 - H(f)\mathcal{A}(f)|^2 P_{x_W}(f)$$

$$P_{e_o}(f) = |H(f)|^2 \sum_{n \ne 0} P_{x_o}\!\left(f - \frac{n}{T}\right) + |1 - H(f)|^2 P_{x_o}(f). \tag{9}$$

Here $P_{x_o}(f)$ and $P_{x_W}(f)$ are related by (3) and (4):

$$P_{x_o}(f) = |\mathcal{A}(f)|^2 P_{x_W}(f); \qquad P_{x_W}(f) = \begin{cases} X, & |f| < W \\ 0, & |f| \ge W. \end{cases} \tag{10}$$

The first terms of (9) represent aliasing; the second terms represent
linear distortion of the signal. The mean-square errors are

$$\langle \overline{e^2(t)} \rangle = \int_{-\infty}^{\infty} P_e(f)\,df, \tag{11}$$

where e(t) represents either $e_W(t)$ or $e_o(t)$. The time average, indicated
by the overbar in (11), may be taken over any integral number of sampling
intervals T, in view of the stationarity of x(t).

The output noise power spectrum is

$$P_{n_r}(f) = NT\,|H(f)|^2. \tag{12}$$

The mean-square output noise is

$$\langle \overline{n_r^2(t)} \rangle = NT \int_{-\infty}^{\infty} |H(f)|^2\,df, \tag{13}$$

where again the time average may be taken over any integral
number of sampling intervals T.

$\mathcal{A}(f)$ is strictly bandlimited to |f| < W; then from (3) or (10)

$$P_{x_o}(f) = 0, \quad |f| \ge W. \tag{14}$$

The output y(t) contains only alias and noise components outside the
band, i.e., for $|f| \ge W$; the reconstruction filter transfer function
should certainly be zero there. Consequently, we assume

$$H(f) = 0, \quad |f| \ge W \tag{15}$$

throughout the remainder of this paper.

By (14) if the sampling is fast enough, WT < 0.5, alias and signal
components are separated in (9). Since H(f) satisfies (15), aliasing is
absent in the output; both $x_r(t)$ and $n_r(t)$ are stationary, and we may
drop the time averages in (11) and (13). Thus in the oversampled
case:

$$\langle e_W^2(t) \rangle = \int_{-W}^{W} |1 - H(f)\mathcal{A}(f)|^2 P_{x_W}(f)\,df$$

$$\langle e_o^2(t) \rangle = \int_{-W}^{W} |1 - H(f)|^2 P_{x_o}(f)\,df, \qquad W \le \frac{0.5}{T}$$

$$\langle n_r^2(t) \rangle = NT \int_{-W}^{W} |H(f)|^2\,df. \tag{16}$$
In the undersampled case the expected values on the left-hand side
of (16) become periodic functions of time. We make the additional ad
hoc assumption that

$$W < \frac{1}{T}; \tag{17}$$

then from (14) no more than two terms overlap in (9). This assumption
is appropriate for the antenna problem described in Sections I and II.
Then from Appendix B:

$$\langle e_W^2(t) \rangle = 2\int_0^{W} \left| e^{-j2\pi t/T}\,H\!\left(f - \frac{1}{T}\right)\mathcal{A}(f) - 1 + H(f)\mathcal{A}(f) \right|^2 P_{x_W}(f)\,df$$

$$\langle e_o^2(t) \rangle = 2\int_0^{W} \left| e^{-j2\pi t/T}\,H\!\left(f - \frac{1}{T}\right) - 1 + H(f) \right|^2 P_{x_o}(f)\,df$$

$$\langle n_r^2(t) \rangle = 2NT\int_0^{1/(2T)} \left| e^{-j2\pi t/T}\,H\!\left(f - \frac{1}{T}\right) + H(f) \right|^2 df. \tag{18}$$

Time averaging (18) subject to (15) yields directly the results obtained
from (9) and (11) and from (13). This is readily seen by substituting

$$\overline{\left| e^{-j2\pi t/T} P + Q \right|^2} = |P|^2 + |Q|^2 \tag{19}$$

into (18), with appropriate choices for P and Q. Finally, the results
(18) simplify when H(f) and $\mathcal{A}(f)$ are real. This special case is
significant because, as we show in Section IV, the optimum filter $H_o(f)$
is always real, and the optimum filter $H_W(f)$ is real for real $\mathcal{A}(f)$, i.e.,
for antenna illumination with zero phase error. These simplifications
are obtained by substituting

$$\left| e^{-j2\pi t/T} P + Q \right|^2 = P^2 + Q^2 + 2PQ\cos 2\pi\frac{t}{T}, \qquad P \text{ and } Q \text{ real}, \tag{20}$$

into (18).
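
The two substitutions rest on a short expansion, sketched here with the shorthand θ = 2πt/T (a symbol introduced only for this note). The cross term averages to zero over one sampling interval, which leaves the sum of squared magnitudes; for real P and Q the cross term reduces to a cosine.

```latex
\begin{aligned}
\bigl|e^{-j\theta}P + Q\bigr|^2
  &= |P|^2 + |Q|^2 + 2\,\mathrm{Re}\bigl(e^{-j\theta} P Q^*\bigr),
  \qquad \theta = 2\pi t/T, \\
\frac{1}{T}\int_0^T \bigl|e^{-j\theta}P + Q\bigr|^2\,dt
  &= |P|^2 + |Q|^2, \\
\bigl|e^{-j\theta}P + Q\bigr|^2
  &= P^2 + Q^2 + 2PQ\cos\theta \qquad (P,\ Q\ \text{real}).
\end{aligned}
```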


IV. OPTIMUM FILTERS 


The results of optimum linear mean-square filter theory are summarized
as follows. Figure 3 shows an input signal $x_i(t)$ filtered by an
input filter with transfer function $\mathcal{A}(f)$, to yield a measured signal
$x_o(t)$. Noise $\nu(t)$ is added to $x_o(t)$. We obtain linear least-mean-square
estimates for the input signal $x_i(t)$ and the measured signal $x_o(t)$ by
filtering $x_o(t) + \nu(t)$ by $H_i(f)$ and $H_o(f)$, respectively, as shown in Fig.
3. The transfer functions of these Wiener filters are given as follows:

$$H_i(f) = \frac{1}{\mathcal{A}(f) + \dfrac{P_\nu(f)}{\mathcal{A}^*(f)\,P_{x_i}(f)}} \tag{21}$$

$$H_o(f) = \frac{P_{x_o}(f)}{P_{x_o}(f) + P_\nu(f)} = \mathcal{A}(f)\,H_i(f). \tag{22}$$

Here $P_{x_i}(f)$, $P_{x_o}(f)$, and $P_\nu(f)$ are the power spectra of the input signal
$x_i(t)$, the measured signal $x_o(t)$ at the output of the filter $\mathcal{A}(f)$, and
of the additive noise $\nu(t)$, respectively. Note that

$$P_{x_o}(f) = |\mathcal{A}(f)|^2 P_{x_i}(f) \tag{23}$$

has been used to obtain the right-hand relation of (22).
Define the deviations of the estimates $y_i(t)$ and $y_o(t)$ of Fig. 3 from
their desired values as follows:

$$d_i(t) = y_i(t) - x_i(t)$$

$$d_o(t) = y_o(t) - x_o(t). \tag{24}$$

The power spectra of these deviations are minimized by the optimum
filters of (21) and (22), as follows:

$$P_{d_i}(f) = \frac{P_{x_i}(f)\,P_\nu(f)}{|\mathcal{A}(f)|^2 P_{x_i}(f) + P_\nu(f)}$$

$$P_{d_o}(f) = \frac{P_{x_o}(f)\,P_\nu(f)}{P_{x_o}(f) + P_\nu(f)}. \tag{25}$$

The corresponding minimum mean-square deviations are obtained by
integrating these spectra.


Fig. 3—Optimum linear mean-square estimation.
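
A minimal numerical sketch of these Wiener relations follows, assuming the standard forms H_i = A* P_xi / (|A|^2 P_xi + P_nu) and H_o = P_xo / (P_xo + P_nu); the function name, the test spectrum values, and the triangular test filter are illustrative choices. The deviation spectrum is compared with the directly computed error spectrum |1 - H_i A|^2 P_xi + |H_i|^2 P_nu.

```python
import numpy as np

def wiener_filters(A, P_xi, P_nu):
    """Pointwise Wiener filters and minimum deviation spectra on a frequency grid."""
    A = np.asarray(A, dtype=complex)
    P_xo = np.abs(A) ** 2 * P_xi                 # spectrum of the measured signal
    H_i = np.conj(A) * P_xi / (P_xo + P_nu)      # estimates the input signal
    H_o = P_xo / (P_xo + P_nu)                   # estimates the measured signal
    P_di = P_xi * P_nu / (P_xo + P_nu)           # deviation spectrum, input estimate
    P_do = P_xo * P_nu / (P_xo + P_nu)           # deviation spectrum, measured estimate
    return H_i, H_o, P_di, P_do

f = np.linspace(-0.99, 0.99, 199)
A = 1.0 - np.abs(f)                              # illustrative real input filter
P_xi, P_nu = 2.0, 0.3                            # illustrative flat spectra
H_i, H_o, P_di, P_do = wiener_filters(A, P_xi, P_nu)

# Error spectrum computed directly from the filter: distortion plus filtered noise.
direct = np.abs(1 - H_i * A) ** 2 * P_xi + np.abs(H_i) ** 2 * P_nu
```

The arrays `direct` and `P_di` agree, and `H_o` equals `A * H_i`, which is the right-hand relation of (22).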

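These results are applied to minimize the mean-square deviations
(6) by the following substitutions in (21) through (25):

$$x_i(t) \to x_W(t)$$

$$d_i(t) \to d_W(t)$$

$$P_{x_i}(f) \to P_{x_W}(f)$$

$$H_i(f) \to H_W(f)$$

$$P_\nu(f) \to \sum_{n \ne 0} P_{x_o}\!\left(f - \frac{n}{T}\right) + NT. \tag{26}$$

Thus the noise $\nu(t)$ of Fig. 3 is replaced by the noise plus alias spectra
of Fig. 2. Recall that $\mathcal{A}(f) = 0$, $|f| \ge W$, and that $P_{x_o}(f)$ and
$P_{x_W}(f)$ satisfy (10). Consequently, $H_W(f)$ and $H_o(f)$ both satisfy (15).
The additional condition (17) eliminates all but the $n = \pm 1$ terms in
the summation of the last line of (26). We summarize these results for
interpolation:

$$H_o(f) = \frac{|\mathcal{A}(f)|^2}{|\mathcal{A}(f)|^2 + \left|\mathcal{A}\!\left(f - \frac{1}{T}\right)\right|^2 + \left|\mathcal{A}\!\left(f + \frac{1}{T}\right)\right|^2 + \frac{NT}{X}} \tag{27}$$

$$P_{d_o}(f) = H_o(f) \cdot X\left[\left|\mathcal{A}\!\left(f - \frac{1}{T}\right)\right|^2 + \left|\mathcal{A}\!\left(f + \frac{1}{T}\right)\right|^2 + \frac{NT}{X}\right] \tag{28}$$

and for restoration:

$$H_W(f) = \frac{1}{\mathcal{A}(f)}\,H_o(f) \tag{29}$$

$$P_{d_W}(f) = \frac{P_{d_o}(f)}{|\mathcal{A}(f)|^2} = X\,\frac{\left|\mathcal{A}\!\left(f - \frac{1}{T}\right)\right|^2 + \left|\mathcal{A}\!\left(f + \frac{1}{T}\right)\right|^2 + \frac{NT}{X}}{|\mathcal{A}(f)|^2 + \left|\mathcal{A}\!\left(f - \frac{1}{T}\right)\right|^2 + \left|\mathcal{A}\!\left(f + \frac{1}{T}\right)\right|^2 + \frac{NT}{X}}. \tag{30}$$

By (17), the terms $\left|\mathcal{A}\!\left(f - \frac{1}{T}\right)\right|^2$ and $\left|\mathcal{A}\!\left(f + \frac{1}{T}\right)\right|^2$ in (27), (28), and
(30) never overlap. As $|f| \to W$, since $\mathcal{A}(f) \to 0$, then $H_o(f)$, $H_W(f)$,
and $P_{d_o}(f)$ all $\to 0$ but $P_{d_W}(f) \to X$.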
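
The band-edge behavior stated above can be checked with a short sketch. It assumes a real input filter (here the triangular transform of the maximum-gain antenna), alias terms at shifts of ±1/T only, and a noise term NT/X in the denominator; function names and parameter values are illustrative.

```python
import numpy as np

def tri(f, W):
    """Triangular transform: 1 - |f|/W inside the band, zero outside."""
    return np.clip(1.0 - np.abs(f) / W, 0.0, None)

def optimum_filters(f, W, T, N, X):
    """Interpolation and restoration filters with their deviation spectra,
    for a real input filter and alias terms at shifts of +/- 1/T only."""
    a = tri(f, W)
    alias = tri(f - 1.0 / T, W) ** 2 + tri(f + 1.0 / T, W) ** 2
    extra = alias + N * T / X                # alias plus noise, relative to X
    denom = a ** 2 + extra
    H_o = a ** 2 / denom                     # interpolation filter
    H_W = a / denom                          # restoration filter, H_o / A without 0/0
    P_do = H_o * X * extra                   # interpolation deviation spectrum
    P_dW = X * extra / denom                 # restoration deviation spectrum
    return H_o, H_W, P_do, P_dW

f = np.array([0.0, 1.0])                     # band center and band edge (W = 1)
H_o, H_W, P_do, P_dW = optimum_filters(f, W=1.0, T=0.8, N=0.1, X=1.0)
```

At the band edge both filters and the interpolation deviation spectrum vanish, while the restoration deviation spectrum equals X: nothing can be recovered where the input filter has no response.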


V. UNIFORM ILLUMINATION (MAXIMUM-GAIN ANTENNA)


From (2)

$$\mathcal{A}(f) = \begin{cases} 1 - |f/W|, & |f| \le W \\ 0, & |f| \ge W. \end{cases} \tag{31}$$

We distinguish two cases:


0 < WT < 0.5; oversampled
0.5 < WT < 1; undersampled. (32)


It is convenient to introduce an auxiliary parameter $S = \langle x_o^2(t) \rangle$
representing the total power of the quantity $x_o(t)$ at the output of the
filter $\mathcal{A}(f)$ (Fig. 2); from (10)

$$S = X \int_{-W}^{W} |\mathcal{A}(f)|^2\,df. \tag{33}$$

For the present maximum-gain antenna, (31) substituted into (33)
yields

$$S = \frac{2}{3}\,XW. \tag{34}$$
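
The integral behind this value is elementary; with the triangular transform 1 - |f|/W and the substitution y = f/W (a variable introduced here for convenience) it reads:

```latex
S = X\int_{-W}^{W}\Bigl(1 - \frac{|f|}{W}\Bigr)^{2} df
  = 2XW\int_{0}^{1} (1-y)^{2}\,dy
  = \frac{2}{3}\,XW, \qquad y = \frac{f}{W}.
```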


Then with the substitution of (34), (27) through (30) yield, for
interpolation:

$$\langle \overline{d_o^2(t)} \rangle = 2S\,\frac{WT}{S/N} \int_0^{\min\left(1,\,\frac{1}{WT} - 1\right)} \frac{(1-y)^2}{(1-y)^2 + \frac{2}{3}\frac{WT}{S/N}}\,dy + 3S \int_{\min\left(1,\,\frac{1}{WT} - 1\right)}^{1} \frac{(1-y)^2\left[\left(1 - \frac{1}{WT} + y\right)^2 + \frac{2}{3}\frac{WT}{S/N}\right]}{(1-y)^2 + \left(1 - \frac{1}{WT} + y\right)^2 + \frac{2}{3}\frac{WT}{S/N}}\,dy, \tag{35}$$

and for restoration:

$$\langle \overline{d_W^2(t)} \rangle = 2S\,\frac{WT}{S/N} \int_0^{\min\left(1,\,\frac{1}{WT} - 1\right)} \frac{1}{(1-y)^2 + \frac{2}{3}\frac{WT}{S/N}}\,dy + 3S \int_{\min\left(1,\,\frac{1}{WT} - 1\right)}^{1} \frac{\left(1 - \frac{1}{WT} + y\right)^2 + \frac{2}{3}\frac{WT}{S/N}}{(1-y)^2 + \left(1 - \frac{1}{WT} + y\right)^2 + \frac{2}{3}\frac{WT}{S/N}}\,dy. \tag{36}$$

Here we combine the two cases of (32). S/N is the observed signal-to-
noise power ratio of the samples in Fig. 2, and WT is the parameter
bounded by (32), indicating the relative degree of over- or undersampling
and consequent aliasing. The first terms of (35) and (36) consist of noise
and distortion. In the oversampled case the second terms are zero; in
the undersampled case the second terms contain aliasing as well.

While it is possible to evaluate (35) and (36) in closed form, the 
results are messy. Moreover, such evaluations for other more realistic 
antenna patterns than the present uniform illumination (e.g., the 
truncated Gaussian illumination considered in the following section) 
will be worse in this regard. Consequently, these integrals have been 
evaluated numerically, with results given in Figs. 4 and 5, showing the 
average deviation power versus sampling parameter for interpolation 
and for restoration, respectively, with observed signal-to-noise ratio 
as a parameter. We note that the deviation is worse for restoration 
than for interpolation, because in the former case we attempt to 
equalize the input filter $\mathcal{A}(f)$, thereby enhancing the noise. Alternatively,
Figs. 4 and 5 may be obtained by direct numerical integration
of (18), using (6), (19), (27), and (29). Analytical expressions for these 
results in the oversampled case, 0 < WT < 0.5, are given in eqs. (77) 
and (78) of Appendix C. 
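
One way such a numerical evaluation might be set up is sketched below for uniform illumination. It is an illustration, not the author's program: it assumes the deviation spectra take the Wiener form with the triangular transform, uses the normalized variable y = f/W and the parameter c = (2/3)·WT/(S/N), and integrates by the trapezoidal rule. The closed forms used for comparison in the oversampled case are derived here by elementary integration and are not copied from Appendix C.

```python
import numpy as np

def deviation_powers(WT, snr, n=200001):
    """Average deviation power over S for (interpolation, restoration),
    uniform illumination, by trapezoidal integration over y = f / W."""
    c = (2.0 / 3.0) * WT / snr                      # noise term NT / X
    y = np.linspace(0.0, 1.0, n)
    a2 = (1.0 - y) ** 2                             # |A|^2 for the triangular transform
    alias = np.clip(1.0 - 1.0 / WT + y, 0.0, None)  # shifted triangle; zero when no overlap
    p = alias ** 2 + c                              # alias plus noise, relative to X
    dy = y[1] - y[0]

    def trapezoid(g):
        return (g.sum() - 0.5 * (g[0] + g[-1])) * dy

    interpolation = 3.0 * trapezoid(a2 * p / (a2 + p))
    restoration = 3.0 * trapezoid(p / (a2 + p))
    return interpolation, restoration

WT, snr = 0.4, 100.0                                # an oversampled example
d_interp, d_restore = deviation_powers(WT, snr)

# Closed forms for the oversampled case (derived here, stated as a check only).
c = (2.0 / 3.0) * WT / snr
closed_interp = 3.0 * c * (1.0 - np.sqrt(c) * np.arctan(1.0 / np.sqrt(c)))
closed_restore = 3.0 * np.sqrt(c) * np.arctan(1.0 / np.sqrt(c))
```

Restoration is never better than interpolation in this sketch, since the restoration integrand is the interpolation integrand divided by |A|^2 ≤ 1.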

For the undersampled case, WT > 0.5, we require in addition the ac
component of the deviation powers. We write

$$\langle d_{o,W}^2(t) \rangle = \langle \overline{d_{o,W}^2(t)} \rangle - D_{o,W} \cos 2\pi\frac{t}{T}, \tag{37}$$

where the subscript is either o or W throughout, and $D_o$ and $D_W$ are the
amplitudes of the ac components of the deviation powers for
interpolation and for restoration, respectively.




Fig. 4—Average deviation power versus sampling parameter, with signal-to-noise 
ratio as a parameter, for interpolation of a bandlimited function: uniform illumination 
and optimum filter. 


Fig. 5—Average deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for restoration of a bandlimited function: uniform illumination
and optimum filter.


Fig. 6—Diagram of ac deviation power versus sampling parameter, with signal-to-
noise ratio as a parameter, for interpolation of a bandlimited function: uniform illumi-
nation and optimum filter.


We determine $D_o$ and $D_W$ from this definition (37) by combining (6),
(18) through (20), (27), and (29). Numerical integration of the resulting
expressions yields the results shown in Figs. 6 and 7.

A partial check on these results is obtained by observing that for
zero noise optimum interpolation must recover the filtered input with
zero error at the sample points, i.e., $x_o(nT)$ must be reconstructed
without error for N = 0. Consequently, the $S/N = \infty$ curves of Fig. 4
and Fig. 6 must coincide, and they do.


Fig. 7—Diagram of ac deviation power versus sampling parameter, with signal-to-
noise ratio as a parameter, for restoration of a bandlimited function: uniform illumina-
tion and optimum filter.

Observe that while $H_o(f)$ of (27), the optimum filter for interpolation,
decreases monotonically as f goes from 0 to W, the same is not
true for $H_W(f)$, the optimum filter for restoration, which
initially increases to a maximum before finally dropping to zero at
f = W. For example, consider the oversampled case, WT < 0.5. Then

$$H_W(f) = \frac{1 - \dfrac{f}{W}}{\left(1 - \dfrac{f}{W}\right)^2 + \dfrac{2}{3}\dfrac{WT}{S/N}}, \qquad 0 \le f \le W. \tag{38}$$

Obviously,

$$H_W(0) = \frac{1}{1 + \dfrac{2}{3}\dfrac{WT}{S/N}}, \qquad H_W(W) = 0. \tag{39}$$

The peak is given by

$$H_W(f)\Big|_{\max} = 0.5\sqrt{\frac{3}{2}\,\frac{S/N}{WT}} \quad \text{at} \quad \frac{f}{W} = 1 - \sqrt{\frac{2}{3}\,\frac{WT}{S/N}}. \tag{40}$$
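
The location and height of this peak are easy to confirm numerically. The sketch below assumes the oversampled restoration filter has the form a/(a² + c), with a = 1 - f/W and c = (2/3)·WT/(S/N); the parameter values are arbitrary examples.

```python
import numpy as np

WT, snr = 0.4, 100.0                       # example values, oversampled case
c = (2.0 / 3.0) * WT / snr

y = np.linspace(0.0, 1.0, 100001)          # y = f / W
H_W = (1.0 - y) / ((1.0 - y) ** 2 + c)     # restoration filter on the grid

i = np.argmax(H_W)
peak_numeric, y_numeric = H_W[i], y[i]

# Setting the derivative of a / (a^2 + c) to zero gives a = sqrt(c).
peak_formula = 0.5 * np.sqrt(1.5 * snr / WT)
y_formula = 1.0 - np.sqrt(c)
```

Because the maximizing value of a is sqrt(c), the peak moves toward the band edge and grows without bound as S/N increases.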


For large S/N the peak of $H_W(f)$ is very large, and very close to
f = W. However, the results of Figs. 5 and 7 remain finite as $S/N \to \infty$.

$D_o$ and $D_W$ of (37) contain contributions from the second and first
equations of (18), respectively, and from the third equation of (18),
used together with (20). The former consist of linear distortion and
aliasing, the latter of noise. All such contributions (to the ac
coefficients D) are 0 for the oversampled case, WT < 0.5. For the
undersampled case, 0.5 < WT < 1, distortion and aliasing make positive
contributions to $D_o$ and $D_W$, while noise makes negative contributions
to these quantities. Taking note of the negative sign on the last term
in (37), the deviation due to distortion and aliasing is worst between
the sample points, while the deviation due to noise is worst at the
sample points.

The deviation powers in Figs. 4 through 7 have been normalized to
the signal power at the output of $\mathcal{A}(f)$ in Fig. 2a, $S = \langle x_o^2(t) \rangle$, given
for the present case by (34). This is perfectly appropriate for Figs. 4
and 6 (“interpolation”), where we wish to reconstruct the quantity
$x_o(t)$. It is less appropriate for restoration, where we wish to reconstruct
$x_W(t)$, the bandlimited version of the input x(t) in Fig. 2a, given by
(4). Here a more appropriate normalization might be the total power
of $x_W(t)$. For the present case (31), (10) and (34) yield

$$\langle x_W^2(t) \rangle = 2XW = 3S. \tag{41}$$


To normalize Figs. 5 and 7 (“restoration”) in this way, simply divide 
the numbers on the vertical axes by 3 and relabel these axes accord- 
ingly. 

The parameter S/N of Figs. 4 through 7 is appropriate for both 
interpolation and restoration, since it is the signal-to-noise ratio 
observed at the sampled output of a radiometer used to measure 
incoherent fields. However, observe that S/N defined here, and used 
throughout the remainder of this paper, is different than the conven- 
tional signal-to-noise ratio at the output of a radiometer receiver. In 
the present work S is proportional to the fluctuation in the radiometer 
signal output as the antenna is scanned across the sky; while the 
conventional radiometer signal output is taken as the average signal 
output as the antenna is scanned across the sky. As we noted in 
Section II, we assume the average radio brightness is deterministic, 
and we treat it separately (see Appendix A). 


VI. GAUSSIAN ILLUMINATION 
Let a one-dimensional antenna of width W have an aperture field 
that is Gaussian: 


$$E(x) = \begin{cases} 10^{-\frac{d}{5}\left(\frac{x}{W}\right)^2}, & |x| < \dfrac{W}{2} \\ 0, & |x| > \dfrac{W}{2}. \end{cases} \tag{42}$$
The field at the edge of the aperture is d dB down from the maximum
field (at the center of the aperture); 


$$d = -20 \log_{10} \frac{E(W/2)}{E(0)}. \tag{43}$$


The symbol d represents the aperture “taper”. 
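The taper definition can be checked numerically. The following Python sketch is illustrative only: it assumes the Gaussian field inside the aperture can be written as E(x) = 10^(-(d/5)(x/W)^2), which is one form that satisfies (43), and the function names are hypothetical.

```python
import math


def aperture_field(x, width, taper_db):
    """Gaussian aperture field, truncated at the aperture edges |x| = width/2.

    Assumed form: E(x) = 10 ** (-(taper_db / 5) * (x / width) ** 2).
    """
    if abs(x) > width / 2:
        return 0.0
    return 10.0 ** (-(taper_db / 5.0) * (x / width) ** 2)


def edge_taper_db(width, taper_db):
    """Edge-to-center field ratio expressed in dB, following (43)."""
    edge = aperture_field(width / 2, width, taper_db)
    center = aperture_field(0.0, width, taper_db)
    return -20.0 * math.log10(edge / center)


# For a 15-dB taper the edge field should come out 15 dB below the center.
print(edge_taper_db(1.0, 15.0))
```

Setting the taper to zero in this sketch gives a field of unity across the aperture, i.e., uniform illumination.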

$\mathcal{A}(f)$ of (3) and Fig. 2, the Fourier transform of the effective width
A(t) of the antenna, is proportional to the convolution of E(x) of (42)
with itself; by (1)


$$\mathcal{A}(f) \propto \int_{\max(-W/2,\; f - W/2)}^{\min(W/2,\; f + W/2)} E(x)\,E(x - f)\,dx. \tag{44}$$


For d = 0, (42) substituted into (44) yields (31) of the preceding section 
for uniform illumination. 
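The self-convolution of the aperture field is easy to evaluate numerically. The sketch below is not the procedure used for the figures; it is a minimal midpoint-rule integration, assuming the Gaussian form E(x) = 10^(-(d/5)(x/W)^2) inside the aperture and ignoring any constant of proportionality. The function names and the step count are hypothetical.

```python
def aperture_field(x, width, taper_db):
    """Assumed Gaussian aperture field, zero outside |x| <= width/2."""
    if abs(x) > width / 2:
        return 0.0
    return 10.0 ** (-(taper_db / 5.0) * (x / width) ** 2)


def aperture_autocorrelation(f, width, taper_db, steps=2000):
    """Midpoint-rule estimate of the integral of E(x) * E(x - f) dx.

    The limits cover only the region where both factors are nonzero.
    """
    lower = max(-width / 2, f - width / 2)
    upper = min(width / 2, f + width / 2)
    if upper <= lower:
        return 0.0
    step = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * step
        total += aperture_field(x, width, taper_db) * aperture_field(x - f, width, taper_db)
    return total * step


# With no taper the result is the triangle W - |f|, the autocorrelation
# of a rectangular aperture; with a 15-dB taper it falls off faster.
print(aperture_autocorrelation(0.25, 1.0, 0.0))
print(aperture_autocorrelation(0.25, 1.0, 15.0))
```

In either case the result vanishes for |f| ≥ W, so the transfer function is strictly bandlimited.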

Figures 8 through 11 give the average and ac deviation powers versus 
sampling parameter with signal-to-noise ratio as a parameter, for 
interpolation and for restoration, for Gaussian aperture illumination 
with taper d = 15 dB. These results may be compared with Figs. 4 
through 7, respectively, for uniform illumination, treated in the preceding
section. As in this prior case, numerical integration seems
preferable to analytical treatment. Equation (44) is evaluated using
(42), the results substituted into (27), (29), and (18), and finally (6)
and (37) are evaluated using (19) and (20).

Fig. 8—Average deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for interpolation of a bandlimited function: Gaussian illumination
with 15-dB taper, optimum filter.

Fig. 9—Average deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for restoration of a bandlimited function: Gaussian illumination
with 15-dB taper, optimum filter.

Fig. 10—Diagram of ac deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for interpolation of a bandlimited function: Gaussian
illumination with 15-dB taper, optimum filter.

Fig. 11—Diagram of ac deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for restoration of a bandlimited function: Gaussian illumination
with 15-dB taper, optimum filter.

Much of the discussion of the preceding section for uniform illumination
applies also to the present Gaussian case. For zero noise the
S/N = ∞ curves of Figs. 8 and 10 again coincide. To normalize the
“restoration” results to the total power of $x_W(t)$ (rather than to
$S = \langle x_0^2(t)\rangle$), (10) and (42), (44) yield


$$\langle x_W^2(t)\rangle = 2XW = 3.281\,S, \qquad d = 15\ \text{dB}; \tag{45}$$


simply divide the numbers on the vertical axes of Figs. 9 and 11 by 
3.281 and relabel these axes accordingly. 

An important difference between the results for uniform illumina- 
tion (discussed in the preceding section) and the present results for 
the more practical Gaussian illumination with 15-dB taper appears in 
the “restoration” results of Figs. 5 and 9 for these two cases, respec- 
tively. Restoration is much more difficult in the present case because 
the input filter $\mathcal{A}(f)$ in Fig. 2 falls off much faster with f, requiring
greater equalization by the output filter and hence yielding more 
output noise. 


VII. SUBOPTIMUM FILTERS


We have so far considered only optimum reconstruction filters, 
according to Section IV, for both interpolation and for restoration. 
Such filters must be changed for each different signal-to-noise ratio 
S/N. We now explore the use of fixed filters, for interpolation and for 
restoration, which are independent of S/N. 

Consider interpolation first. The optimum filter $H_0(f)$ of (27) is well
behaved and, in particular, changes only slightly as the noise power N
increases from zero. It is natural to use $H_0(f)$ for N = 0 as a suboptimum
filter for finite but small N, i.e., for large but finite S/N.

The situation for restoration is quite different. The optimum filter
$H_W(f)$ of (29) is badly behaved, as we discussed in connection with
(38) through (40); small changes in S/N can produce large changes in
$H_W(f)$ near its peak. For the oversampled case, 0 < WT < 0.5, $H_W(f)$
for N = 0 has a pole at f = W, and this filter therefore yields infinite
output noise for finite S/N. Consequently, use of $H_W(f)$ for N = 0 for
finite S/N is restricted to the undersampled case, 0.5 ≤ WT < 1.

Figures 12 through 15 show the average deviation power for inter- 
polation and for restoration, for uniform illumination and for Gaussian 
illumination with a 15-dB taper. In all cases the suboptimum filter 
used is the optimum filter for S/N = ∞. For restoration, only the
undersampled case, 0.5 < WT < 1, is shown, as we discussed above. 

Observe that the curves of Figs. 12 and 14 (suboptimum interpolation,
for uniform and for Gaussian illumination, respectively) are
identical in the oversampled case, 0 ≤ WT < 0.5. Here the reconstruction
filter is simply

$$H(f) = \begin{cases} 1, & |f| < W \\ 0, & |f| > W. \end{cases} \tag{46}$$

Thus, since aliasing is absent, in the absence of noise the suboptimum
(i.e., zero noise) reconstruction filter simply passes the input without
distortion. The output deviation then results from noise alone. A
simple result is given in (79), Appendix C, for this case. We can
observe further, comparing (77) and (79), that the S/N = 10 curves in
Figs. 4 and 12 start out with identical slope for small WT, but that as
WT increases the curve of Fig. 4, for the optimum interpolation filter,
falls below the curve of Fig. 12, for the suboptimum interpolation
filter.

Fig. 12—Average deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for interpolation of a bandlimited function: uniform illumination,
suboptimum filter equal to optimum filter for S/N = ∞.

Fig. 13—Average deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for restoration of a bandlimited function: uniform illumination,
suboptimum filter equal to the optimum filter for S/N = ∞.

Fig. 14—Average deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for interpolation of a bandlimited function: Gaussian illumination
with 15-dB taper, suboptimum filter equal to optimum filter for S/N = ∞.

Fig. 15—Average deviation power versus sampling parameter, with signal-to-noise
ratio as a parameter, for restoration of a bandlimited function: Gaussian illumination
with 15-dB taper, suboptimum filter equal to optimum filter for S/N = ∞.

We conclude that for interpolation with large S/N, we may use the 
reconstruction filter designed for S/N = ∞ with little loss. This is not
true for restoration. These results are to be expected from the discus- 
sion of the optimum filters given above in these two cases. The ac 
deviation powers for interpolation do not change much from those of 
Figs. 6 and 10, so we omit plots of them here. 

For interpolation with smaller S/N, the greatest penalty for the 
suboptimum filter occurs near critical sampling, WT = 0.5. For ex- 
ample, compare Figs. 4 and 8 with Figs. 12 and 14, respectively, for 
S/N = 10; the suboptimum filter is about 35 percent worse for uniform 
illumination and about 67 percent worse for Gaussian illumination 
with a 15-dB taper than the optimum filters for these cases. 


VIII. DISCUSSION 


The present results show how to process data obtained by measure- 
ment of a one-dimensional incoherent field with a one-dimensional 
antenna, and what the resulting errors will be. We assume the present 
results provide some indication for the real, two-dimensional case. 

In the model of Fig. 2, T is the sampling interval and N is the noise
power. By the definitions in Section II, T corresponds to the angular
separation between antenna observations, and N to the mean-square
error in receiver output due to receiver noise. The noise of a radiometer
receiver is inversely proportional to the observation time; moreover,
if a given time is allotted to measure a given region of the sky, the
time per observation is inversely proportional to the number of observations,
or directly proportional to the angular separation between
observations. Consequently,


$$NT = \frac{\text{Constant}}{\text{Observing Time per Unit Angle of Sky}}, \tag{47}$$


where the constant in (47) is independent of the antenna. 

Let us examine the data of Figs. 4, 5, 8, 9, 12, or 14 for fixed 
observing time per unit angle, i.e., from (47) for NT = constant. 
Consider as a specific example Fig. 4. Take the point given by S/N =
100, WT = 1; then $\overline{\langle d_0^2(t)\rangle}/S \approx 0.187$. The measurement parameters
S/N = 10, WT = 0.1 yield the same value of NT, and hence by (47)
the same observing time per unit angle of sky; for these parameters
$\overline{\langle d_0^2(t)\rangle}/S = 0.017$. In this example, sampling ten times as often with
1/10 the signal-to-noise ratio has reduced the mean-square deviation
by a factor of 0.187/0.017 ≈ 11. More generally, for average deviation
in any of the above figures, compare:


1. Any S/N = 10 curve. 
2. The corresponding S/N = 100 curve with its horizontal scale 
compressed by a factor of 10, i.e., each point x, y replotted to x/10, y. 





In every case the compressed S/N = 100 curve will lie above the
S/N = 10 curve for WT > 0.05. For 0 ≤ WT < 0.05 the two curves
coincide precisely, as we see in the second paragraph of Appendix C.
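The comparison just described amounts to holding the product NT fixed. The following Python sketch, with hypothetical function names and with S and W set to unity purely for convenience, confirms that the two operating points quoted from Fig. 4 share the same NT, and shows the horizontal compression applied to an S/N = 100 curve.

```python
import math


def noise_time_product(snr, wt, signal_power=1.0, bandwidth=1.0):
    """Return N*T for a given S/N and normalized sampling parameter WT."""
    noise_power = signal_power / snr
    sampling_interval = wt / bandwidth
    return noise_power * sampling_interval


def compress_curve(points, factor):
    """Replot each (WT, deviation) point at (WT / factor, deviation)."""
    return [(wt / factor, deviation) for wt, deviation in points]


# The two operating points taken from Fig. 4 in the text.
same_time = math.isclose(noise_time_product(100.0, 1.0),
                         noise_time_product(10.0, 0.1))
print(same_time)

# Compressing the S/N = 100 point by 10 places it at WT = 0.1, where it
# can be compared directly with the S/N = 10 value of 0.017.
print(compress_curve([(1.0, 0.187)], 10.0))
print(0.187 / 0.017)
```

The deviation values 0.187 and 0.017 are the ones read from Fig. 4 above; the sketch does not compute them.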

From this, we conclude that undersampling is always bad; it will 
always be better to reduce T and increase N, i.e., to take more closely 
spaced observations each with poorer signal-to-noise ratio, to avoid 
operating in the undersampled region. Moreover, in the preferred
oversampled region the family of curves in each of these figures can
be combined into a single curve; i.e., for 0 < WT ≤ 0.5 the mean-square
deviation $\langle d^2(t)\rangle/S$ is a function of the single normalized
variable [(S/N)/(2WT)]. These results are in agreement with a prior
observation of R. W. Wilson. 

Recalling from Section VII that Figs. 12 and 14 are identical for
0 < WT < 0.5, only five curves are required to summarize the data of
Figs. 4, 5, 8, 9, 12, and 14 in the oversampled region. This is done in
Figs. 16 and 17, for interpolation and for restoration, respectively.
Note that the vertical axis of Fig. 16 has been normalized to $S = \langle x_0^2(t)\rangle$
of (33), the same as Figs. 4, 8, 12, and 14. However, the vertical
axis of Fig. 17 has instead been normalized to $\langle x_W^2(t)\rangle$, as discussed in
(41) and (45), i.e., differently from the normalization for Figs. 5 and 9.
For a given antenna and a given receiver noise temperature, the total
observing time for a given area of sky is proportional to the product
of the area observed and the horizontal axis variable [(S/N)/(2WT)]
of Figs. 16 and 17; however, observe from (33) and (47) that the
constant of proportionality depends on the antenna illumination, and
hence differs for the uniform and for the Gaussian cases.

Fig. 16—Root-mean-square deviation for interpolation versus signal-to-noise ratio in
the oversampled case, 0 ≤ WT < 0.5.

Fig. 17—Root-mean-square deviation for restoration versus signal-to-noise ratio in
the oversampled case, 0 ≤ WT < 0.5.

Finally, we observe that while undersampling (WT > 0.5) is bad, 
oversampling (WT < 0.5) offers no advantage over critical sampling 
(WT = 0.5). Reducing WT below 0.5 requires more data storage, with
no reduction in error. 

Let us examine the implications of these results for ordinary anten- 
nas. Consider the antenna of Section VI, with Gaussian illumination 
(42) and significant taper (e.g., d = 15 dB, as in the examples). To 
determine the gross structure of the main lobe we may neglect the 
truncation of the aperture field in (42), i.e., assume E(x) is given by 
the top expression of (42) for all x. Then the effective width is the 
Fourier transform of (44) with infinite limits on the integrals: 


$$A(t) \propto 10^{-\frac{10\pi^2}{d\,\ln^2 10}(Wt)^2}, \qquad d \gg 1. \tag{48}$$

Observations are frequently taken at an angular separation of one full
3-dB beamwidth; i.e., the receiving power patterns at two adjacent
observations overlap at their 3-dB points. In our present model this
corresponds to a sampling interval

$$T = 2t_{3\text{-dB}}, \tag{49}$$

where $t_{3\text{-dB}}$ is the 3-dB half-width of the antenna pattern (48);

$$A(t_{3\text{-dB}}) = \tfrac{1}{2}A(0). \tag{50}$$

From (48) through (50)

$$W \cdot 2t_{3\text{-dB}} = \frac{2}{\pi}\sqrt{\frac{d\,\ln 2\,\ln 10}{10}}. \tag{51}$$

For the 15-dB taper chosen for the examples, (51) yields

$$W \cdot 2t_{3\text{-dB}} = 0.985, \qquad d = 15\text{-dB taper}. \tag{52}$$


Thus, measurements separated by a full 3-dB beamwidth with a 15- 
dB antenna taper will be undersampled by about a factor of 2, with 
corresponding penalties indicated in Figs. 8 and 9. 
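The value 0.985 can be reproduced with a short calculation. The Python sketch below assumes an untruncated Gaussian aperture field of the form 10^(-(d/5)(x/W)^2), so that the power pattern is itself Gaussian; under that assumption the product of W and the full 3-dB beamwidth is (2/π)·sqrt(d·ln 2·ln 10/10). The function names are hypothetical.

```python
import math


def beamwidth_product(taper_db):
    """W times the full 3-dB beamwidth for an untruncated Gaussian taper."""
    return (2.0 / math.pi) * math.sqrt(
        taper_db * math.log(2.0) * math.log(10.0) / 10.0
    )


def overlap_level_db(spacing_in_beamwidths):
    """Level, in dB below the peak, at which adjacent Gaussian power
    patterns cross when observations are spaced by the given fraction
    of the full 3-dB beamwidth."""
    return 10.0 * math.log10(2.0) * spacing_in_beamwidths ** 2


# A 15-dB taper sampled at one full beamwidth gives WT close to 0.985,
# roughly twice the critical value WT = 0.5.
print(round(beamwidth_product(15.0), 3))

# Halving the spacing moves the crossover from the 3-dB points to
# about the 0.75-dB points.
print(round(overlap_level_db(1.0), 2), round(overlap_level_db(0.5), 2))
```

The second function relies on the Gaussian shape of the pattern, for which the level in dB grows as the square of the angular offset.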

Table I illustrates the above discussion.

Table I—Numerical examples

                             Normalized        Normalized rms Deviations
       10 log10 S/N    d     Observing     Interpolation         Restoration
 WT        (dB)       (dB)     Time      Avg.   Max.   Min.    Avg.   Max.   Min.
 1.0        20         15          1     0.24   0.33   0.095   0.72   0.83   0.60
 0.5        17         15          1     0.12   0.12   0.12    0.53   0.53   0.53
 0.5        57         15     10,000                           0.12   0.12   0.12
 0.5        36          0         73                           0.12   0.12   0.12

The same antenna size and receiver noise figure are assumed for the
four cases shown. A typical antenna, with Gaussian illumination and a
15-dB taper, is used for the first three examples. The final example
uses a maximum-gain antenna, with uniform illumination (no taper). The
root-mean-square (rms) deviations have been normalized to
$\langle x_0^2(t)\rangle = S$ for interpolation (33) and to $\langle x_W^2(t)\rangle = 2XW$ for
restoration (41), (45). In each case, from (47)

$$\text{Observing Time} = \frac{\text{Constant}}{NT} = \text{Constant} \cdot \frac{W}{S} \cdot \frac{S/N}{WT}, \tag{53}$$

with the third factor determined from the first and second columns of
the above table, and S in the denominator of the second factor given
by (45) for the first three examples and by (41) for the last example.
Finally, the observing times are normalized such that the first two
examples have normalized observing times equal to unity.
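The normalized observing times in Table I can be reproduced from (53). In the Python sketch below, the function name and the `power_ratio` argument are hypothetical; with X and W held fixed, 1/S is proportional to the ratio of the total bandlimited power to S, which is 3.281 for the 15-dB Gaussian taper (45) and 3 for uniform illumination (41).

```python
def observing_time(wt, snr_db, power_ratio):
    """Relative observing time, proportional to (1/S) * (S/N) / (WT).

    power_ratio is the total bandlimited power divided by S, so that
    1/S is proportional to it when X and W are fixed.
    """
    snr = 10.0 ** (snr_db / 10.0)
    return power_ratio * snr / wt


# (WT, S/N in dB, power ratio) for the four rows of Table I.
rows = [
    (1.0, 20.0, 3.281),
    (0.5, 17.0, 3.281),
    (0.5, 57.0, 3.281),
    (0.5, 36.0, 3.0),
]

reference = observing_time(*rows[0])
normalized = [observing_time(*row) / reference for row in rows]
for row, value in zip(rows, normalized):
    print(row, round(value, 2))
```

Because the table quotes S/N rounded to the nearest decibel, the second and third rows come out within a fraction of a percent of 1 and 10,000 rather than exactly equal to them; the fourth row rounds to 73.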

In the top row observations are taken at twice critical separation 
(i.e., observations separated by about a full 3-dB beamwidth). Since 
the data are undersampled, the rms deviations vary, being minimum 
at the observation points and maximum half-way in between. The 
normalized deviation is much larger for restoration than for interpo- 
lation. 

The second row shows the same antenna and receiver with critical 
sampling. Twice as many observations are made, each for half the 
time, with the signal-to-noise ratio reduced by 3 dB; hence, the total 
observing time for a given area of the sky is the same as for the first 
row. The rms deviations are now independent of position with respect 
to the observation points. The rms interpolation deviation is about 
half as large, and the rms restoration deviation is about three-fourths 
as large, as the corresponding average deviations for the undersampled 
case, in row 1. The deviation for restoration remains much larger than 
that for interpolation. 

The third row shows the same antenna and receiver as the first two 
rows, with a 40-dB higher signal-to-noise ratio than the second row; 
consequently, the total observing time is increased by a factor of 
10,000. The deviations are greatly reduced, that for restoration now 
being equal to the interpolation deviation for the preceding case of 
row 2. 

Finally, the fourth row shows that a maximum-gain antenna, with 
uniform illumination (no taper), and the same receiver as that of row 
3, will attain the same rms restoration deviation with a total observing 
time of only 73, as compared to 10,000 for an antenna with Gaussian 
illumination and a 15-dB taper (row 3). Of course, the observing time 
is still large compared to that of row 2; i.e., to obtain the same rms 
deviation for restoration with an antenna with uniform illumination 
as for interpolation with an antenna with a 15-dB taper takes 73 times
as long. 

The above discussion makes undersampling appear unattractive. 
Nevertheless, much existing data have been taken in this way; the 
present results show the best way to process such data, and the 
resulting errors. The optimum filters (27) and (29) in the undersampled
case minimize the time-average mean-square deviations $\overline{\langle d^2(t)\rangle}$
in the present model (6). It is shown in Appendix D that they also
minimize the time-dependent mean-square deviations $\langle d^2(t)\rangle$. In the
corresponding antenna problem the optimum data spatial filters min- 
imize the mean-square error everywhere in the region containing the 
observed points. 


IX. CONCLUSIONS 


Consider astronomical measurements of radio brightness that has 
white spatial variation (i.e., that varies rapidly compared to the beam- 
width of the antenna used to make the measurement), made at a 
regular array of points in the sky. Optimum mean-square estimates 
for the measured and true brightness at any point in the sky are called 
“interpolation” and “restoration”, respectively. Idealize this problem 
to one dimension. Then: 

1. Data points should be separated by about half the (full) 3-dB 
beamwidth for a normal antenna, having a tapered aperture illumi- 
nation; i.e., the receiving power patterns at two adjacent observations 
should overlap at their 0.75-dB points. This is often not done. 

2. Interpolation can be accomplished with reasonable accuracy and 
reasonable observation time. 

3. Restoration with reasonable accuracy requires much longer ob- 
servation time than interpolation. 

4. Optimum spatial filters depend on the signal-to-noise ratio. For
interpolation this dependence is very weak. The optimum filter for 
zero noise works fairly well for interpolation of finite signal-to-noise 
ratios; this is not true for restoration. 

5. A maximum-gain antenna, with uniform aperture illumination, is 
better for restoration than a conventional antenna with tapered ap- 
erture illumination. 

The following additional studies are suggested by the present work: 

1. Interpolation and restoration with a finite number of data points, 
perhaps not regularly spaced (e.g., edge effects). 

2. Tolerances in the spatial filters applied to the measured data, and 
in the antenna illumination. 

3. Interpolation and restoration with reduced resolution, by addi- 
tional spatial filtering. 
4. Nonwhite sky brightness statistics, e.g., strong isolated point
sources embedded in white brightness. 

5. Treatment of the real two-dimensional problem. Greater variety 
is evident, e.g., square, hexagonal, and irregular sampling patterns are 
of interest in two dimensions. 
APPENDIX A
Condition for DC Component To Be Deterministic 


A real wide-sense stationary random process has mean $\langle x(t)\rangle$ and
covariance $\phi_x(\tau) = \langle x(t+\tau)x(t)\rangle$, both independent of t. We assume

$$\lim_{|\tau|\to\infty} \phi_x(\tau) = a^2. \tag{54}$$
Then define ¢,,(7) by the relation 
p:(7) =a’ + px,(7). (55) 
Thus, 
lim (7) = 0. (56) 
[7 | 
The spectral density P,(f), the Fourier transform of ¢,(7), is by (55) 
P,(f) = a76(f) + P;,(f), (57) 
where P,,(f) is the Fourier transform of ¢,,(r) of (55). By (56), P..(f)
contains no component proportional to 6(f), ie., P,.(f) contains no 
delta function at the origin. The dc power of x(t) is thus a’. 

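Now define

$$\overline{x(t)} = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} x(t)\,dt \tag{58}$$

as the dc component of an individual noise wave x(t). We have

$$\langle \overline{x(t)} \rangle = \langle x(t) \rangle. \tag{59}$$

Next,

$$\langle \overline{x(t)}^{\,2} \rangle = \lim_{T\to\infty} \frac{1}{(2T)^2}\int_{-T}^{T}\!\!\int_{-T}^{T} \phi_x(t - s)\,dt\,ds = a^2. \tag{60}$$

Define $x_{ac}(t)$ by the relation

$$x(t) = \overline{x(t)} + x_{ac}(t). \tag{61}$$

Then

$$\langle x_{ac}(t)\rangle = 0, \tag{62}$$

$$\phi_x(\tau) = a^2 + \phi_{x_{ac}}(\tau). \tag{63}$$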
If we compare (55) and (63), 
bx,(T) = $z,(T). (64) 


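We investigate the conditions under which $\overline{x(t)} = \langle x(t)\rangle$ with
probability 1, i.e., for which almost every x(t) has the same dc component.
Define

$$y = \overline{x(t)} - \langle x(t)\rangle. \tag{65}$$

Obviously $\langle y\rangle = 0$. Now,

$$\langle y^2\rangle = \langle \overline{x(t)}^{\,2}\rangle - \langle x(t)\rangle^2 = a^2 - \langle x(t)\rangle^2, \tag{66}$$

the last step following from (60).
Therefore, if

$$\lim_{|\tau|\to\infty} \phi_x(\tau) = \langle x(t)\rangle^2, \tag{67}$$

then with probability 1

$$\overline{x(t)} = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} x(t)\,dt = \langle x(t)\rangle. \tag{68}$$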

The dc component may then be treated as a fixed quantity, and the 
ac component [whose spectrum is the second term of (57)] estimated 
separately. A stationary shot noise, for example, satisfies (67). Note
that in (67) 


lim ¢,(7) =i) P3(f)df. (69) 


|r |- 


APPENDIX B 
Derivation Of Time-Varying Deviations 


The time average of the expected error powers (5) and noise power 
are given immediately in terms of their power spectra, (9) and (12), 
by (11) and (13). However, the time-varying error and noise powers 
(18) are not so easily obtained. All three quantities of (18) are derived 
in a similar manner; we choose the middle one, (e2(t)), as represent- 
ative. 

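Use the Fourier series representation for the random process $x_0(t)$,
as follows:’

$$x_0(t) = \sum_{|n|<N} x_{0n}\,e^{jn2\pi f_0 t}, \tag{70}$$

$$f_0 = \frac{W}{N}, \tag{71}$$

$$\lim_{f_0\to 0} \frac{1}{f_0}\langle x_{0n}x_{0m}^{*}\rangle = 0, \qquad n \ne m, \tag{72}$$

$$\lim_{f_0\to 0} \frac{1}{f_0}\langle |x_{0n}|^2\rangle = P_{x_0}(nf_0). \tag{73}$$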

The restriction on the summation in (70), with N defined in (71), 
arises from (14). From Fig. 2, the error e,(t) of (5) is given by 


e(t)= Y (Hn — 1)xXonei?* 
[n|<N 


oF ™ 2 FH, nt+K xenernnt 
—-N<n<0 
poe 
te TT Y An-KXone?™", (74) 
0<n<N 
where 
Hsia an (75) 
n= O/> AT : 


The restrictions on the summations in (74) arise from (14), (15), and 
(17). The first line of (74) represents linear distortion; the second and
third lines represent aliasing. If we combine terms in (74), 


ee in2rfot 
e(t)= SS le THsx~1+ Holtone 


—N<n<0 


ae: 
# je Hee ete. - 76) 
0<n<N 
where we ignore the n = 0 term by the assumption that any dc 
component is deterministic and treated separately. Calculating (e2(t)) 
using (72) and (73), we obtain the middle relation of (18). The other 
two relations are similarly obtained. 


APPENDIX C 
Some Analytical Results 


Consider the results (35) and (36) for a one-dimensional antenna 
with uniform illumination in the oversampled case, 0 < WT < 0.5. 
Here min (1, 1/(WT) — 1) = 1; aliasing is absent, the upper limit of 
the first terms in (35) and (36) is 1, and the second terms in these 
relations are absent. Then these results are readily evaluated by partial 
fraction expansions of the integrands to yield the following results for 
optimum reconstruction filters, for interpolation and for restoration, 
respectively: 


cen) = 25 Ef, \/2 an \/2 | cm 


ae Ay PWT eur) oN 
(dw(t)) = 3S 3 S/N tan 5 WT (78) 


These relations give the curves of Figs. 4 and 5 for 0 < WT < 0.5. 
We omit similar, but messier, results for 0.6 < WT < 1. Observe that 
the normalized deviations (77) and (78) are functions of only (WT)/(S/N).
In the oversampled case we can reduce the sampling interval
and signal-to-noise ratio proportionately; i.e., in Figs. 4 and 5 for 0 < 
WT < 0.5, the points with equal y-coordinates on the S/N = 100 and 
S/N = 10 curves have x-coordinates WT whose ratio is precisely 10. 
This simple property does not hold if either x-coordinate WT > 0.5. 

We have no such analytic results for Gaussian illumination, Figs. 8
and 9. However, R. W. Wilson has pointed out that in the absence of
aliasing, halving the sampling interval and doubling the sample noise
power must leave the final result unaltered. As a result, in the
oversampled case, 0 < WT < 0.5, the normalized deviations must be
functions of only (WT)/(S/N); in particular, the S/N = 100 and
S/N = 10 curves of Figs. 8 and 9 for 0 < WT < 0.5 must scale in the
same way as described above for Figs. 4 and 5. This is most readily
seen from present results by observing that for 0 < WT < 0.5 all terms
$\mathcal{A}(f \pm 1/T)$ in (27) through (30) disappear, and the noise and sampling
interval appear only as the product NT. Thus the deviations remain
unchanged as long as NT is fixed, i.e., if the number of samples is
multiplied, and the signal-to-noise ratio divided, by the same factor.

Finally, we have an extremely simple result for the suboptimum
filter for interpolation in the oversampled case, for both uniform and
Gaussian illumination. Here H(f) is given by (46). There is neither
aliasing nor linear distortion, and substituting (46) into the third
relation of (18) yields

$$\langle d^2(t)\rangle = S \cdot \frac{2WT}{S/N}. \tag{79}$$

The linear relation (79) yields the curves of Figs. 12 and 14 for 0 <
WT < 0.5. Again the normalized deviation depends only on (WT)/(S/N).
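The linear dependence on (WT)/(S/N) is simple enough to tabulate directly. The Python sketch below assumes that the normalized deviation for the suboptimum interpolation filter in the oversampled case is 2WT/(S/N), i.e., the output noise power of an ideal low-pass filter of bandwidth W divided by S; the function names are hypothetical.

```python
import math


def normalized_variable(snr, wt):
    """Horizontal-axis variable (S/N)/(2WT) used to collapse the curves."""
    return snr / (2.0 * wt)


def suboptimum_interpolation_deviation(snr, wt):
    """Normalized mean-square deviation, assumed equal to 2WT/(S/N).

    Valid only where aliasing is absent (oversampled case).
    """
    if not 0.0 < wt <= 0.5:
        raise ValueError("relation assumed only for 0 < WT <= 0.5")
    return 2.0 * wt / snr


# Two operating points with the same NT give the same deviation.
print(suboptimum_interpolation_deviation(100.0, 0.5))
print(suboptimum_interpolation_deviation(10.0, 0.05))
```

For S/N = 10 and WT = 0.1 this relation gives 0.02, slightly above the value 0.017 quoted from Fig. 4 for the optimum filter at the same point, as expected for a suboptimum filter.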


APPENDIX D 


Optimum Filters Minimize Time-Dependent Mean-Square Deviations 


The optimum linear estimator for $x_0(t)$ of Fig. 2 (i.e., “interpolation”)
may be written as®

$$y(t) = \sum_k a_k(t)\,[x_0(kT) + n(kT)], \tag{80}$$

where $x_0(kT)$ and n(kT) are the signal and noise samples of (8); the
coefficients $a_k$, necessarily functions of time, are selected to minimize
the mean-square deviation $\langle d_0^2(t)\rangle$, where

$$d_0(t) = y(t) - x_0(t). \tag{81}$$

The $a_k(t)$ satisfy®

$$\Big\langle \Big\{x_0(t) - \sum_k a_k(t)\,[x_0(kT) + n(kT)]\Big\} \cdot \{x_0(iT) + n(iT)\} \Big\rangle = 0. \tag{82}$$

Equation (82) yields

$$\phi_{x_0}(t - iT) - \sum_k a_k(t)\,\phi_{x_0}((k - i)T) - N\,a_i(t) = 0, \tag{83}$$

where N is the power of the (independent) noise samples (Fig. 2) and
$\phi_{x_0}(\tau)$ is the covariance of $x_0(t)$, i.e., the Fourier transform of $P_{x_0}(f)$ of
(1).


The index i and the summation index k in eqs. (82) and (83) range
over the set of samples used to estimate $x_0(t)$. While this set may be
finite, the present work assumes an infinite number of samples are
used. Then it is immediately obvious by the stationarity of $x_0(t)$ that

$$a_k(t) = a_0(t - kT) = a(t - kT), \tag{84}$$

where we drop the subscript 0 as unnecessary. Substituting (84) into
(80), the optimum estimator for $x_0(t)$ is

$$y(t) = \sum_{k=-\infty}^{\infty} [x_0(kT) + n(kT)]\,a(t - kT), \tag{85}$$

where a(t) is given by

$$\phi_{x_0}(t) = \sum_{k=-\infty}^{\infty} \phi_{x_0}(kT)\,a(t - kT) + N\,a(t). \tag{86}$$

If we Fourier transform (86), use the Poisson sum formula, and solve
for the transform of a(t),

$$A(f) = \frac{P_{x_0}(f)}{\dfrac{1}{T}\displaystyle\sum_{k=-\infty}^{\infty} P_{x_0}\!\left(f - \dfrac{k}{T}\right) + N}. \tag{87}$$

Then, if we substitute (3) into (87), comparing (8) with (80), we
identify a(t) with T·h(t), and hence replace A(f) by T times $H_0(f)$ of
(27), to yield

$$H_0(f) = \frac{|\mathcal{A}(f)|^2}{\displaystyle\sum_{k=-\infty}^{\infty} \left|\mathcal{A}\!\left(f - \frac{k}{T}\right)\right|^2 + \frac{NT}{X}}. \tag{88}$$

Finally, imposing the constraint (17) and recalling that $\mathcal{A}(f)$ is strictly
bandlimited to |f| < W [Fig. 2 or (3) and (14)], (88) becomes identical
to (27).

We recall that in the undersampled case (0.5 ≤ WT < 1) the optimum
interpolation filter $H_0(f)$ of (27) minimized the time-average mean-square
deviation $\overline{\langle d_0^2(t)\rangle}$ of (6). The present appendix shows that
$H_0(f)$ also minimizes the time-dependent mean-square deviation
$\langle d_0^2(t)\rangle$ for all t.

A similar discussion may be given for restoration; the optimum filter
$H_W(f)$ of (29), which minimizes $\overline{\langle d_W^2(t)\rangle}$, also minimizes $\langle d_W^2(t)\rangle$ for
all t.
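The structure of the optimum interpolation filter can be explored numerically. The Python sketch below evaluates a filter of the form |A(f)|² / (Σ_k |A(f − k/T)|² + NT/X) inside the band and sets it to zero outside, which is how we read the bandlimiting constraint. It assumes, for illustration only, a triangular transfer function for uniform illumination normalized to unity at f = 0; the function names, the `noise_ratio` argument (standing for NT/X), and the truncation of the sum are all hypothetical.

```python
import math


def uniform_transfer(f, w):
    """Assumed transfer function for uniform illumination: a triangle
    on |f| < w, normalized to 1 at f = 0."""
    return max(0.0, 1.0 - abs(f) / w)


def optimum_interpolation_filter(f, w, t, noise_ratio, k_max=10):
    """Optimum interpolation filter at frequency f.

    noise_ratio stands for N*T/X. The sum over aliased terms is
    truncated at |k| <= k_max, which is exact here because the
    transfer function is strictly bandlimited.
    """
    if abs(f) >= w:
        return 0.0
    numerator = uniform_transfer(f, w) ** 2
    aliased = sum(uniform_transfer(f - k / t, w) ** 2
                  for k in range(-k_max, k_max + 1))
    return numerator / (aliased + noise_ratio)


# Oversampled (WT = 0.4), no noise: the filter passes the band unchanged.
print(optimum_interpolation_filter(0.5, 1.0, 0.4, 0.0))

# Oversampled with noise: the gain drops below unity.
print(optimum_interpolation_filter(0.5, 1.0, 0.4, 0.1))

# Undersampled (WT = 0.8), no noise: the aliased term reduces the gain.
print(optimum_interpolation_filter(0.5, 1.0, 0.8, 0.0))
```

In the noise-free oversampled case the sketch reproduces the ideal low-pass response of (46), which is why that filter serves as the suboptimum choice for large S/N.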
To characterize the transmission performance of the public switched network,
the Bell System conducted an intensive field measurement study of
end-office-to-end-office connections from October 1982 to January 1983. A special
multistage sampling plan and ASPEN, a flexible, automatic data acquisition 
system based on the UNIX™ operating system, were developed to evaluate 
the more than 10,000 transmission paths included in the study. This paper 
describes both the ASPEN measurement system and the sampling plan. A 
companion paper in this issue of the AT&T Bell Laboratories Technical Journal 
describes the network transmission performance results. 


I. INTRODUCTION AND SUMMARY 


The Bell System traditionally conducted field measurement studies 
of network transmission performance to evaluate existing administra- 
tive and maintenance procedures and to set objectives for new trans- 
mission systems and equipment.’""’ The 1982/83 End Office Connec- 
tion Study (EOCS) had additional goals arising from Computer Inquiry 
II and the 1984 divestiture of the Bell operating companies from 
AT&T. Specifically, the EOCS was intended to support allocation of 
performance objectives across segments of the network; to aid planners 
of new telecommunications systems, services, and terminal equipment; 
and to serve as a benchmark against which new system arrangements
and services might be compared. The EOCS and allied studies now 
provide a database to support work on both the traditional and 
divestiture-related goals. 

To conduct the EOCS, a software-controlled field measurement 
system called ASPEN (Automatic System for Performance Evaluation 
of the Network) was developed, a special sampling plan was prepared, 
and software tools were adapted to allow for rapid screening and 
analysis of the measurement data. This section presents an overview 
of ASPEN and the EOCS sampling plan. The accompanying paper 
describes the study results in detail, along with a description of the 
data analysis methods.” 


1.1 The ASPEN Data Acquisition System 


The most recent field measurement study comparable to the EOCS 
is the 1969/70 Connection Survey.’ In that study, manually operated 
measurement equipment was carried from office to office for a period 
of one year, and a total of about 600 transmission paths were evaluated. 
(A given test connection offers two directions of transmission for 
testing, hence two transmission paths. In the 1969/70 connection 
survey, measurements were made on one transmission path per test 
connection.) Measurement data were entered manually on a terminal 
connected to a central computer. While this procedure worked well, a 
highly automated data acquisition system was deemed desirable for 
the EOCS for two fundamental reasons. 

First, setting performance objectives requires accurate information 
about the tails of performance distributions, including distributions 
conditioned on end-office switch type, time of day, mileage band, and 
other criteria. It was estimated that a sample size of 10,000 transmis- 
sion paths was needed to meet the study goals. The time and cost of 
manual data acquisition would have been prohibitive. 

Second, a modifiable and reusable measurement tool appeared to be 
the most efficient way to meet future demand for field performance 
measurements of inter- and intraexchange network segments, as well 
as a host of planned new services. Progress during the 1970s in 
automatic measurement equipment and mini/microcomputer hard- 
ware and software had made the development of such a tool possible. 

As Fig. 1 shows, the ASPEN application to the EOCS consisted of
20 remotely controlled instrumentation packages (called Remote Test
Units or RTUs) connected to the line side of mainframes in selected
Bell System end offices. Under the control of a host computer, the
RTUs called one another over the public switched network just as
actual customers would. Once a connection between two RTUs was
established, automatic measurement equipment in the RTUs evaluated
more than 25 transmission parameters on that connection (see
Table I).

Fig. 1—ASPEN System: End Office Connection Study.

The ASPEN structure offered several advantages over the 1969/70 
approach, the most notable of which was speed. With ASPEN evaluating more than
120 transmission paths per day, the equivalent of the 1969/70 data
was gathered in five days as opposed to a full year. In all, over a period
extending from October 1982 to January 1983, parameters were mea- 
sured on more than 10,000 paths from over 7,000 test connections. (In 
the EOCS, approximately 3000 of the 7000 test connections were 
evaluated in both directions, while the remaining connections were 
evaluated in one direction only. Thus the total number of transmission 
paths evaluated was approximately 2(3000) + 4000 = 10,000.) 

A second advantage offered by the ASPEN structure was flexibility. 
As described in Sections II and III, the sequence of operations at an 
RTU was controlled by a microprocessor whose software was down 
loaded from the host computer. The measurement sequence could be, 
and was, changed as needed. 

Finally, because the data screening and preliminary analysis ran 
concurrently with data acquisition, information was available to make 
changes in the data acquisition process while the study was still in 
progress. The analysis tools also made it possible to obtain final results 
quickly, despite the much larger volume of data. 

A particularly challenging aspect of ASPEN development was the 
preparation of host computer software that could handle fault condi- 
tions gracefully. The ASPEN host computer for the EOCS was a 
Table I—1982/83 End office connection study parameters measured 

Voice and data     Insertion loss (1 kHz) 
                   Loss vs. frequency (30 freqs) 
                   C-message noise 
                   Call cutoffs 
                   Propagation delay 

Voiceband data     Signal-to-noise ratio 
                   C-notched noise 
                   3-kHz flat noise 
                   3-kHz notched noise 
                   Noise-to-ground 
                   Envelope delay vs. frequency (30 freqs) 
                   Peak-to-average ratio (P/AR) 
                   2nd-order intermodulation distortion 
                   3rd-order intermodulation distortion 
                   Phase jitter (2-300 Hz) 
                   Phase jitter (20-300 Hz) 
                   Amplitude jitter (2-300 Hz) 
                   Amplitude jitter (20-300 Hz) 
                   Impulse noise (6 thresholds) 
                   Gain hits (3 thresholds) 
                   Phase hits (3 thresholds) 
                   Dropouts 
                   Frequency shift 
                   1200-b/s bit/block error rate 
                   4800-b/s bit/block error rate 


VAX-11/780* computer running under the UNIX operating system. 
As described in Section III, ASPEN’s host computer control software 
consisted of three distinct layers responsible for call management, 
connection supervision, and overall connection scheduling. This struc- 
ture successfully handled the occasional dropped connection, inter- 
mittent RTU failure, or host computer service interruption (both 
scheduled and unscheduled). As a result, ASPEN could run virtually 
unattended on a daily basis. 

In addition to providing a friendly environment for program devel- 
opment, the UNIX operating system offered data storage and analysis 
tools to permit data screening and analysis to run concurrently with 
data acquisition. A relational database package and the S statistical 
analysis package (created at AT&T Bell Laboratories) were a key part 
of the ASPEN system and the EOCS. Together with special C language 
data screening programs, these tools were used to ensure data integrity, 
to store the data in a compact, logical structure, and to allow rapid 
exploratory analysis of results. Because the data screening programs 
accepted data directly from the RTUs, manual handling of measurement 
data was completely eliminated. 


1.2 The end office connection study sampling plan 


The sampling plan included both RTU location (spatial) and test 
* Trademark of Digital Equipment Corporation. 
connection scheduling (temporal) components. As described in Section 
IV, RTU locations were selected to yield performance data represent- 
ative of different mileage bands, end office switch technologies (ESS™, 
crossbar, and step-by-step switching equipment), and other strata. A 
unique aspect of the EOCS spatial sampling plan was that the sampled 
units, end offices, were quite different from the units being evaluated, 
namely, telephone calls. 

With 20 RTUs deployed, there were 380 possible originating/ter- 
minating pairs over which calls could be placed. The EOCS temporal 
sampling plan, operating through the ASPEN Scheduler software, 
managed the selection of the RTU pairs. A unique feature of the 
temporal sampling plan algorithm was its ability to enhance the 
number of busy-hour connections and to handle gracefully the addi- 
tions or deletions of RTUs caused by equipment problems. 
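
The count of 380 follows from treating each connection as an ordered (originating, terminating) pair of distinct RTUs, so that 20 RTUs give 20 x 19 pairs. A minimal sketch of that count is given below; the function name ordered_pairs is hypothetical and is not part of the ASPEN software.

```python
from itertools import permutations

def ordered_pairs(n_rtus):
    """Return every (originating, terminating) pair of distinct RTU indices."""
    return list(permutations(range(n_rtus), 2))

pairs = ordered_pairs(20)
print(len(pairs))  # 380, that is, 20 * 19
```

Because direction matters, the pair (0, 1) and the pair (1, 0) are counted as two different transmission paths.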


1.3 Summary 


The ASPEN approach to acquisition of network performance data 
proved highly successful; data were gathered more than 20 times as 
fast as by previous manually oriented methods. The ASPEN host 
computer software structure shows how a fault-tolerant automatic 
performance characterization system may be implemented, and the 
ASPEN spatial and temporal sampling plans show how locations may 
be chosen and measurements scheduled to evaluate a network or other 
telecommunications service. 

Sections II and III provide a detailed description of the hardware 
and software design of ASPEN as it was applied to the EOCS. While 
the components used in ASPEN are explicitly listed, the hardware 
description is a generic one because there are many possible choices 
for RTU and host computer components. The software description in 
Section III is more explicit. While the actual code is not presented, 
the structure is described in some detail, with emphasis on fault- 
handling capabilities. 

Section IV describes how the a priori study requirements and the 
opportunities offered by the fully automated ASPEN approach com- 
bined to create special sampling challenges. Also described is an 
iterative preselection step that made it possible to meet the need for 
stratification of results by mileage band and end-office switch tech- 
nology (ESS, crossbar, or step-by-step switching equipment). 


II. ASPEN HARDWARE SUBSYSTEMS 


The ASPEN data acquisition system is composed of two major 
hardware subsystems: the remote test unit and the host computer. 
This section describes the hardware components used in each of these 
subsystems. 




2.1 Remote test unit hardware 


The Remote Test Unit (RTU) subsystem has five major hardware 
components: a microcomputer that controls the operation of the RTU, 
a remotely controlled transmission Impairment Measuring Set (IMS), 
“test” data sets used in conjunction with a bit and block error rate 
testing capability, and a switching matrix. Table II lists the specific 
equipment used for the EOCS, and Fig. 2 shows a complete RTU, 
ready for shipment and installation. 

Figure 3 is a schematic representation of the ASPEN data acquisi- 
tion system. The host computer and RTU microprocessor communi- 
cate by means of 1200-b/s full-duplex data sets operating over an 
ordinary dial-up connection through the public switched network. 
Using the switching matrix, the microprocessor connects the line 
under test, IMS, test data sets, Bit Error Rate Receiver (BERR), and 
Bit Error Rate Transmitter (BERT) in the various configurations 
required to measure the parameters listed in Table I. 

The IMS measures all the parameters in Table I except bit/block 
error rate and propagation delay. Selection and testing of the IMS, an 
especially critical phase of the ASPEN project, was guided by AT&T 
PUB 41009, a technical reference on the evaluation of transmission 
impairment measurement equipment. The instrument chosen for 
ASPEN (see Table II) struck the compromise among performance, 
cost, and availability that was most appropriate for the EOCS. The 


Table II—ASPEN remote test unit components 


Transmission impairment measuring set 
Hekimian Laboratories Inc. Model 3701 Communications Test System with EIA 
RS232C Remote Control Option 


Microprocessor 
Colorado Data Systems 53A Smart Hardware System* 
Card Complement: 
1. Zilog Z80 microprocessor, two RS232 input/output ports with buffered input and 
output 
2. Three relay cards (ten relays per card) 
3. Bit error rate receiver card 
4. Bit error rate transmitter card 
5. Counter card (four counters per card) 


Data modems 

1. For host-RTU communications: 
Western Electric 212AR 

2. For bit and block error rate measurements: 
Western Electric 212AR 
Racal Vadic 3450 
Western Electric 208B-L1B 
Codex 5208R 


Equipment case 
Environmental Container Systems, Inc. fibercase enclosure (2 ft x 2 ft x 4 ft) with 
shock-mounted rack (see Fig. 2). 


* Trademark of Colorado Data Systems. 
Fig. 2—ASPEN RTU. 


RTU instruments were remotely self-checked for accuracy throughout 
the data-gathering period. 

As is common practice, ASPEN measures bit and block error rates 
by transmitting and receiving a repeated pseudorandom bit stream 
with the BERT and BERR equipment. The data sets used in the 
EOCS (see Table II) were widely used in the network and readily 
available at the time of the study. All data sets were pretested in the 
laboratory to ensure that none exhibited performance aberrations 
absent in other sets of the same type. 

ASPEN uses an extension of the error rate measurement technique 
Fig. 3—Schematic representation of the ASPEN instrumentation. 


to measure round-trip propagation delay. As Fig. 4 shows, an error is 
deliberately injected into the transmitted bit stream at the near-end 
BERT. The bit stream containing the error is transmitted by a 
1200-b/s full-duplex data set, and the error is detected by the far-end BERR. 
This event triggers injection of an error into the far-end BERT bit 
stream, and that error is transmitted by the data sets back to the 
near-end BERR. The elapsed time between error injection at the near-end 
BERT and detection at the near-end BERR (minus a predetermined constant 
to allow for data set delays) is the round-trip propagation delay. 
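
The arithmetic of this measurement can be sketched as follows. The function name, the millisecond units, and the numerical values are assumptions made for illustration; the paper does not give the value of the data set allowance.

```python
def round_trip_delay_ms(t_inject_ms, t_detect_ms, data_set_delay_ms):
    """Round-trip propagation delay: elapsed time minus the fixed data set allowance."""
    elapsed = t_detect_ms - t_inject_ms
    if elapsed < data_set_delay_ms:
        raise ValueError("elapsed time is shorter than the data set allowance")
    return elapsed - data_set_delay_ms

# Hypothetical values: error injected at 0 ms, detected back at the
# near-end BERR at 95 ms, with a 40-ms allowance for the data sets.
print(round_trip_delay_ms(0.0, 95.0, 40.0))  # 55.0
```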


2.2 Host computer hardware 


The ASPEN system host computer consists of a mini- or microcom- 
puter capable of running the UNIX operating system. Table III lists 
the host computer hardware used in the EOCS. The number of input/ 
output (I/O) ports and amount of primary memory (i.e., random 
access) required depend on how many RTUs are to be controlled 
simultaneously. In addition, enough secondary memory (i.e., disk 
subsystems) is required to support the installation of a database large 
enough to store the data to be collected. If the RTUs are not connected 




Fig. 4—Round-trip propagation delay measurement. (Schematic: the near-end and far-end RTUs each contain a BERT with error injection and a BERR with error detection, linked by 1200-b/s data sets through the telephone network.) 


Table III—ASPEN/end office connection study host hardware 

Components         Equipment Used 

CPU                VAX-11/780 with battery backup and DEC floating 
                   point accelerator 
Primary memory     3.75 M-bytes random access memory 
Secondary memory   Three DEC RM05 removable disk drives, 256M bytes each 
Tape backup        One DEC TU77 high-speed tape drive 
I/O ports          Four DEC DZ11 8-channel asynchronous multiplexers, providing 
                   32 ports, assisted by two DEC KMC11B auxiliary 
                   microprocessors 
Dial-out           Four DEC DN11-DA Automatic Calling Unit (ACU) Controllers, 
                   and four WE 557A ACU-sharing arrangements. 


directly to the host computer, Automatic Calling Units (ACUs) or 
their equivalent must be present to dial up all the RTUs. 


III. ASPEN SOFTWARE SUBSYSTEMS 


In the early planning stages, a decision was made to utilize the 
UNIX operating system for all aspects of the study. The UNIX system 
provided an excellent software environment for such assorted tasks as 
formulation and testing of statistical plans, development of ASPEN 
system software, collection and analysis of data, and generation of 
reports. 


3.1 Remote test unit software subsystem 


The RTU software communicates with the host computer while it 
simultaneously controls the RTU hardware. The RTU contains an 
operating system that accepts down-loaded programs and commands 
from the host computer, executes subroutines in these programs, and 
exchanges data with the host. 

The RTU software locally controls the various hardware functions 
by dividing the program into subroutines. Each subroutine contains 
an option that allows two or more subroutines to be linked. 

In the EOCS the RTUs measured performance parameters of Direct 
Distance Dialing (DDD) connections. This involved the use of the 
RTUs in pairs. Because the RTU memory is limited, a separate 
program was needed for the originating end and the terminating end 
of the connection. Therefore, two RTUs testing a particular DDD 
connection had complementary subroutines that were initiated by 
“start” commands from the host computer. The subroutine timing was 
designed to be robust enough to accommodate a plus-or-minus 3- 
second error in the individual starting times of the complementary 
subroutines. 

The complementary subroutine functions used in a typical EOCS 
sequence are shown in Table IV. 


3.2 RTU-host interface 


The interaction between the host computer and the RTU is based 
on the concept of a software module. A module is defined as a set of 
instructions at the host computer that causes execution of a subroutine 
at the RTU, and passes the RTU information necessary to execute 
the subroutine. There exists a one-to-one correspondence between 


Table IV—Subroutine functions from typical measurement sequence 

Originating RTU                          Terminating RTU 

Dial RTU test connection to far end      Answer 
Measure analog parameters                Send test tones 
Send data to host                        Send data to host 
Send test tones                          Measure analog parameters 
Send data to host                        Send data to host 
Dial far-end RTU reference connection    Answer 
Measure envelope delay                   Send test tones 
Send test tones                          Measure envelope delay 
Send data to host                        Send data to host 
Connect 1200-b/s data set                Connect 1200-b/s data set 
Measure propagation delay                Establish error return loop 
Send data to host                        — 
Measure bit/block error rate             Measure bit/block error rate 
Send data to host                        Send data to host 
Connect 4800-b/s data set                Connect 4800-b/s data set 
Transmit 4800-b/s data                   Measure bit/block error rate 
—                                        Send data to host 
Measure bit/block error rate             Transmit 4800-b/s data 
Send data to host 
host computer modules and RTU subroutines. The concept of modules 
will be covered more thoroughly in a later section. 

From the point of view of the host computer, executing a module 
means conversing with the RTU microprocessor, exchanging infor- 
mation with it, and starting the execution of a subroutine at the RTU. 
Related tasks include sending and retrieving data from the RTU and, 
in general, executing any of the RTU built-in operating system com- 
mands. The RTU microprocessor operating system gives the host 
immediate feedback about the disposition of a command at the RTU. 
This feedback consists of echo checks, which ensure the integrity of 
commands transmitted to the RTU, and status checks, which make 
sure that the subroutines have been executed correctly. Whenever a 
subroutine finishes executing, the RTU generates a status symbol, 
which verifies that the subroutine is completed. 

The host computer also synchronizes the two RTUs engaged in 
testing a transmission path. By commanding execution of modules at 
both RTUs simultaneously, the host maintains any critical timing 
relationships between corresponding subroutines at the different RTUs. 

Another mechanism used in the host-RTU interaction is the gen- 
eration of checksums. Whenever a program is down loaded from the 
host computer to the RTU, the RTU generates a unique checksum, 
which is used by the host computer to verify the integrity of the 
program. In addition, transmission of data from the RTU is imple- 
mented in an error-free fashion by the use of parity checks and 
character counts, with retransmissions in case of errors. 
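
The combination of parity checks, character counts, and retransmission can be illustrated with a short sketch. Everything specific here is an assumption: the paper does not give the frame layout, so the example uses 7-bit characters with an even parity bit, a leading character count, and a hypothetical retry limit of three.

```python
def add_even_parity(byte7):
    """Set bit 7 so that the 8-bit character has an even number of 1 bits."""
    ones = bin(byte7 & 0x7F).count("1")
    return (byte7 & 0x7F) | ((ones & 1) << 7)

def parity_ok(byte8):
    """True if the 8-bit character has an even number of 1 bits."""
    return bin(byte8).count("1") % 2 == 0

def encode(text):
    """Frame a message as (character count, parity-protected characters)."""
    chars = [add_even_parity(ord(c)) for c in text]
    return len(chars), chars

def decode(count, chars):
    """Return the text, or None if the count or any parity bit is wrong."""
    if count != len(chars) or not all(parity_ok(c) for c in chars):
        return None
    return "".join(chr(c & 0x7F) for c in chars)

def receive(send, max_tries=3):
    """Call send() until a frame passes both checks or the retries run out."""
    for _ in range(max_tries):
        text = decode(*send())
        if text is not None:
            return text
    raise RuntimeError("no error-free frame received")

# Simulated link that corrupts one bit on the first attempt only.
attempts = []

def flaky_send():
    count, chars = encode("LOSS=-7.2")
    if not attempts:
        chars[0] ^= 0x01
    attempts.append(1)
    return count, chars

print(receive(flaky_send))  # LOSS=-7.2
```

In this sketch the corrupted first frame fails the parity check, so the receiver asks again and accepts the second frame. A single parity bit detects any odd number of bit errors in a character; the character count catches dropped or inserted characters.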

The simplest way for the host and the RTU to communicate is by 
maintaining a communication link open between them for the duration 
of a measurement sequence (e.g., a dial-up connection). However, with 
more sophisticated programs available for the RTUs, the need for this 
communication link diminishes, and it suffices to poll the RTUs 
periodically to retrieve data from them or to reinitialize them. This 
significantly reduces transmission costs. 


3.3 Host computer software 


The remainder of this section contains detailed descriptions of the 
structure and the major components of the ASPEN host software. 
Wherever necessary, specific references to the EOCS implementation 
of the host software are made, but the software structure is presented 
in a generic form. For ease of readability, program names are shown 
in typewriter face and file references are presented in italics. The 
index n represents the number of physical RTUs in the ASPEN study. 
As Fig. 5 shows, there are three principal layers of ASPEN control 
and communication software: 

1. Layer 1 contains n call programs, each providing a connection 
Fig. 5—ASPEN host software configuration. 


from the host computer to one RTU, as well as a library of I/O 
functions used to interact with it. 

2. Layer 2 contains n/2 supv programs, each responsible for con- 
trolling and synchronizing one pair of RTUs, by interacting with the 
subordinate pair of call programs. 

3. Layer 3 consists of a single scheduler program, responsible for 
implementing the testing schedule for the study, as well as for dealing 
with the real-time constraints of the host computer system. 


3.3.1 Call 


Call is the base software layer and backbone of the ASPEN system. 
It provides communication with the RTU in a robust, error-tolerant 
fashion. The call program is based on a standard UNIX system 
utility, the cu (call UNIX) command. The cu command is a general- 
purpose software tool that enables the user to establish a telephone 
connection from the host computer to a remote computer system 
(based on the UNIX system or some other system). The cu command 




provides a full-duplex environment to the user by splitting itself into 
two parts, one part for each direction of transmission. It manages an 
interactive conversation with the remote system and allows file trans- 
fers in either direction. The call program differs from cu in that it 
operates in half-duplex mode, requiring only a single process to run.* 
It is also designed specifically to converse with an RTU,† and it is not 
interactive, but takes its commands from a command file. A benefit 
of a system such as the UNIX system is that resources spent devel- 
oping a single program can be exploited in multiple invocations of the 
same program. For the EOCS, 20 invocations of the call program 
ran simultaneously. 

One call process exists for each working RTU in the ASPEN 
system. Upon its execution, call establishes a control link between 
the host and the RTU.‡ It then checks that the connection quality 
exceeds a minimum standard, and enters a quiescent state in which it 
waits for a signal to proceed. Each call process is assigned an index, 
which is used thereafter to identify it to various files with which it 
interacts. For example, process call(i) notifies the operator of its 
status by depositing information in the file Trace(i). Similarly, 
call(i) reads its instructions from the file Control(i). Each call 
process interacts with four I/O files: Control, Data, Status, 
and Trace. 

The Control file contains the instructions to be executed at any 
given time. In the case of the EOCS, these individual instructions are 
written in a format denoted CDSLANG, or CDS Language, an inter- 
mediate interpretive language based on the built-in commands of the 
CDS 53A hardware controller, the microprocessor-based controller 
used in each RTU. However, the call program isolates this language 
in such a way that other formats could be designed to control different 
RTUs. A collection of CDSLANG instructions, bound together to 
perform a discrete function, is denoted a module. Modules, which are 
stored as simple UNIX system files, are named according to the 
function they cause the RTU to perform. For example, a module 
named Reset might logically accomplish the software resetting of a 
component of the RTU. 


* The UNIX operating system is a multitasking operating system, wherein a process 
is an image of a software program, loaded in core memory, with its own data and stack 
segments and a possibly shared text segment. Multiple invocations of a single program 
can result in multiple processes resident in core memory. Since the system operates in 
half-duplex mode, fewer processes reside in core memory, and a substantial reduction 
in CPU load can be achieved. 

† For the EOCS, the call program contained code specific to the CDS 53A hardware 
system controller. 

‡ The control link for the EOCS consisted of a telephone connection using 1200-b/s 
modems. 
To execute a module, that is, to get a call process to execute the 
instructions comprising the module, the module must be copied into the 
Control file, and the call program must be signaled to begin. The call 
program will wake up and begin executing the CDSLANG instructions 
until it successfully completes all of them, or until one fails. A failure 
may be due to transmission difficulties between the host and the RTU, 
or to RTU hardware problems. The outcome of the module execution 
is reported to both the Trace file and the Status file. More than 50 
modules were developed for the EOCS, using as building blocks more 
than 30 CDSLANG instructions. 
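
The stop-at-first-failure behavior described above might be sketched as follows. This is an illustration only: the function run_module, the callback send_to_rtu, and the instruction names are hypothetical, since the paper lists neither the CDSLANG instructions nor the call program's internals.

```python
def run_module(instructions, send_to_rtu):
    """Execute instructions in order and stop at the first failure.

    Returns (status, trace). The status is "OK" or "FAIL" for the module
    as a whole; the trace holds one line per instruction attempted.
    """
    trace = []
    for instr in instructions:
        ok = send_to_rtu(instr)
        trace.append(f"{instr}: {'ok' if ok else 'failed'}")
        if not ok:
            return "FAIL", trace
    return "OK", trace

# Hypothetical instruction names; the second one fails.
status, trace = run_module(
    ["close_relay", "read_counter", "open_relay"],
    lambda instr: instr != "read_counter",
)
print(status)      # FAIL
print(len(trace))  # 2
```

The single status value corresponds to what is passed upward through the Status file, while the per-instruction lines correspond to the more detailed record an operator could follow through the Trace file.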

A Trace file exists to store the current status of each call process 
whenever it changes. By utilizing an appropriate display program, an 
operator may monitor RTU actions. 

The Data file is a transient template file used to deposit, on a 
temporary basis, data that are retrieved from the RTU. From this file 
data are transferred to other, more permanent, files that are time- 
stamped to reflect the date and time of acquisition. 

The Status file is the I/O channel between the call program and 
the next higher layer, the supv program. This file holds information 
on the success/failure of modules treated as whole entities. No infor- 
mation is available as to which instruction within a module failed, 
merely that the module as a whole has failed. In addition, the Status 
file is used to report problems associated with the control link between 
the RTU and the host (e.g., a dropped telephone line). The call 
program itself is designed to take care of such situations and redial 
dropped connections, but the supv program makes sure that the 
second call program is suspended until the control link is reestab- 
lished. 


3.3.2 Supv 


The supv program constitutes the intermediate software layer in 
the ASPEN system. Its role is to supervise the operation of the two 
call processes beneath it, guiding them throughout the execution of 
a prearranged sequence of modules. The supv follows instructions 
contained in the modfile, a file specifying an ordered sequence of 
modules plus logical rules to follow in case the predefined sequence 
does not proceed normally. The program is able to monitor the actions 
of the two RTUs by analyzing the status information provided by 
call processes. The success of the supv may be quantified by the 
number of modules it guides the call processes through, within a 
specified time frame. 

In the case of the EOCS, a typical modfile specified a collection of 
30 modules to be executed sequentially, which would: 

1. Down load programs into each of two RTUs 




2. Establish a test connection between the two RTUs 

3. Make a sequence of analog measurements on the test connection 

4. Perform bit and block error rate measurements on the test 
connection 

5. Transfer measurement data to the host computer 

6. Reset the RTU hardware and release the test connection. 

Upon invocation, the supv processes, one for each pair of call 
processes, would begin to execute the modules named in the modfile. 
One by one, a module would be copied into the Control files correspond- 
ing to the two call processes it was supervising, and the call 
processes would be awakened. Supv would then wait for the status of 
the module execution to be appended to the Status files. If execution 
was successful, supv would move on to the next module prescribed in 
the modfile. Otherwise, a set of logical rules specified in the modfile 
would direct supv through a sequence of actions ranging from repeat- 
ing the module to executing an alternate set of modules, or aborting 
the whole modfile. To prevent endless loops, each module has a 
maximum allowable execution time, or “time-out” constraint. For the 
EOCS, eight principal modfiles controlled the measurement sequencing. 
New modfiles are easily developed, a flexibility which makes it 
possible to change the direction of a measurement study or to generate 
special-purpose substudies. When operating all RTUs, 10 supv proc- 
esses supervised 20 call processes simultaneously, with data collec- 
tion taking place on 10 test connections. 
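
A supervision loop of this kind, with repetition, an alternate module sequence, and abort as the possible responses to a failure, can be sketched as below. The modfile layout (a list of dictionaries with the keys "module", "timeout_s", "retries", and "alternate") and all names are assumptions for illustration; the paper does not give the modfile syntax.

```python
def supervise(modfile, run_on_pair):
    """Walk a pair of call processes through the modules in modfile.

    modfile: list of dicts with keys "module" and "timeout_s", plus
        optional "retries" (extra attempts) and "alternate" (a list of
        the same form, tried if every attempt fails).
    run_on_pair(module, timeout_s): returns True if both call processes
        report success within the time-out.
    Returns "OK" if every module (or its alternate) succeeded, else "ABORT".
    """
    for step in modfile:
        for _ in range(1 + step.get("retries", 0)):
            if run_on_pair(step["module"], step["timeout_s"]):
                break  # module succeeded, move on to the next one
        else:
            alternate = step.get("alternate")
            if alternate is None or supervise(alternate, run_on_pair) != "OK":
                return "ABORT"
    return "OK"

# Hypothetical module names: one fails once, another always fails.
failures = {"measure_analog": 1}

def fake_run(module, timeout_s):
    if failures.get(module, 0) > 0:
        failures[module] -= 1
        return False
    return module != "ber_4800"

modfile = [
    {"module": "download", "timeout_s": 120},
    {"module": "measure_analog", "timeout_s": 600, "retries": 1},
    {"module": "ber_4800", "timeout_s": 900,
     "alternate": [{"module": "reset", "timeout_s": 60}]},
]
print(supervise(modfile, fake_run))  # OK
```

Here the time-out is simply handed to run_on_pair, which is where a real implementation would bound the wait for the Status files.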

The supv processes take control of the call processes at the 
beginning of each new time period, defined in the next section. Each 
supv process is also assigned an index upon invocation, and only then 
finds out which two call processes it is to supervise. This information 
is conveyed by means of the Supvctl files. The Strace files also exist 
to store diagnostics from the supv. 

Much as the success of the supv program is based on the outcome 
of individual modules, the progress of the scheduler program is 
based on the success or failure of whole modfiles. In the example 
above, a successful execution of the modfile would mean that the two 
RTUs involved successfully executed all the modules specified by the 
modfile on a test connection. A failed modfile execution would imply 
that a connection between the two particular RTUs needs to be 
rescheduled by the scheduler, since it has failed. 


3.3.3 Scheduler 


The scheduler program keeps track of the status of the entire 
measurement system. In contrast to call and supv, the scheduler 
program has no fixed form; its implementation will vary depending 
on the criteria specified by a statistical sampling plan. It can range 
from a simple control program, which repeatedly schedules a connection 
between the same set of RTUs, to a complex set of procedures, 
which govern a large number of RTUs, assuring that certain combi- 
nations of tests occur at certain times of the day, with special consid- 
eration for a subset of important RTUs. For the EOCS, the sched- 
uler implemented a scheduling algorithm in accordance with the 
EOCS sampling plan. The aim was to ensure a uniform distribution 
of the number of measurements of each of the 380 possible RTU pair 
transmission paths (with 20 RTUs), while maximizing the number of 
RTU pair connections established during the busy hour(s) at the 
originating RTU location (nominally 10:00 to 11:00 a.m. and 2:00 to 
3:00 p.m., local time). 

In addition, the scheduler is responsible for making sure the 
study can accommodate whatever maintenance procedures exist for 
the host computer. The scheduler design provided that RTUs would 
not be active during a one-hour period each day reserved for 
maintenance of the host computer; this period was scheduled a week in 
advance and stored in a file readable by the scheduler. By agreement, the computer servicing 
time occurred within a five-hour period before 5 a.m. daily. 

The scheduler also requires robustness; it must provide adequate 
procedures for recovering from unexpected system downtime cleanly. 
For the EOCS, the scheduler consisted of a table- and list-driven 
C program that worked in combination with several smaller shell 
scripts. (The shell, a high-level programming language and command 
interpreter, is a fundamental component of the UNIX operating 
system.) For less complex studies, simpler implementations of the 
scheduler are possible entirely in shell control language. 

For the EOCS, the concepts of a run and a time period were developed. 
A time period consists of the time necessary for a pair of call 
processes to execute the modules specified by the modfile under the 
control of the supv process, approximately 2.5 hours for the EOCS. 
A run consists of a full cycle of time periods such that all possible 
RTU pair combinations are tested. For the EOCS a run was 38 time 
periods with 10 simultaneous RTU-RTU connections per time period. 
The scheduler used the beginning of each time period to resyn- 
chronize the scheduling algorithm, compute the connections for the 
next time period, manipulate lists and tables necessary to implement 
the busy-hour requirement of the sampling algorithm, and deal with 
RTUs that were out of service. As soon as the list of connections for 
the next time period was available, the scheduler would deposit the 
information into each of n/2 files, labeled Supvctl(1) through 
Supvctl(m), where n is the number of RTUs in the study and m = n/2. 
These files constitute the I/O channel to the intermediate supv 
level described above. 
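
The structure of a run (38 time periods of 10 simultaneous connections covering all 380 ordered RTU pairs) can be reproduced with the standard round-robin "circle method." The paper does not say which construction the EOCS scheduler used, so the sketch below is an assumption, and the name build_run is hypothetical. It also omits the busy-hour weighting and the handling of out-of-service RTUs described above.

```python
def build_run(n):
    """Return a list of time periods for n RTUs (n even).

    Each period is a list of (originating, terminating) pairs in which
    every RTU appears exactly once. Over the whole run every ordered
    pair of distinct RTUs appears exactly once.
    """
    if n % 2:
        raise ValueError("this sketch assumes an even number of RTUs")
    m = n - 1
    rounds = []
    for r in range(m):
        pairs = [(m, r)]  # RTU m stays fixed while the others rotate
        for k in range(1, n // 2):
            pairs.append(((r + k) % m, (r - k) % m))
        rounds.append(pairs)
    # Second half of the run: the same pairings in the opposite direction.
    reversed_rounds = [[(b, a) for a, b in pairs] for pairs in rounds]
    return rounds + reversed_rounds

run = build_run(20)
print(len(run), len(run[0]))  # 38 10
```

With 20 RTUs the sketch yields 38 periods of 10 connections each. At the nominal 2.5 hours per time period, such a run would take about 95 hours.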
IV. END OFFICE CONNECTION STUDY SAMPLING PLAN 


As we noted in the Introduction, the use of ASPEN technology has 
engendered new statistical problems involving sampling and data 
analysis. In this section we examine sampling issues somewhat gen- 
erally. For the EOCS, sampling involves two components: choice of 
RTU locations and scheduling of calls in time. They are treated 
separately in this paper since they involve spatial and temporal con- 
siderations, respectively. Furthermore, the preselection method and 
the scheduling algorithm used to resolve these issues are general tools 
that may be used separately in other applications. Data analysis issues 
are examined in the companion paper.” 

The problem involving selection of RTU locations arises because 
sampling methods required to support the new measurement meth- 
odology do not fit the classical sampling context. In the classical 
context experimental units are items on which measurements are 
made, sampling units are items that could be included in the sample, 
and sampling and experimental units are the same. For the EOCS, 
experimental units are connections, defined by ordered pairs of 
end office switching machines. The sampling units are end office 
buildings and they are not the same as experimental units. The 
situation was further complicated since study goals required adequate 
representation of certain strata in the sample, and stratification vari- 
ables are defined in terms of both experimental and sampling units. 
For example, strata defined by airline mileage (a property of a pair of 
end office switching machines and hence defined in terms of experi- 
mental units) and by originating switch type (defined in terms of 
sampling units) were required. 

A new method of sampling, herein called the preselection method, 
provides a general method for sampling networks under representation 
constraints. The method, an extension of the method of snowball 
sampling,’ is the subject of Section 4.1. 

As Section 4.2 describes, the scheduling problem involves determin- 
ing when to place calls between the various RTU pairs. An algorithm 
was developed that provides two outputs: a synchronous, clocked 
schedule of RTU pairs to be in conversation at any time, and a list of 
the end office switches to which RTUs are to be connected. (Where 
end offices contained more than one type of switch, the EOCS sam- 
pling plan specified connections using each switch type.) It also 
provides for stratification of calls by busy versus nonbusy hour. 


4.1 Remote test unit location sampling 


With 20 RTUs allocated to the EOCS,* at least 380 basic experimental 
units are defined by originating-terminating pairs.* These 
experimental units cannot correspond to sampling units since obtaining 
a sample of 380 experimental units by selecting 380 office pairs 
would lead to the use of (up to) 760 RTUs. Thus, sampling units must 
be buildings that house end office switches, and the experimental and 
sampling units do not correspond. 

* Examination of data from Ref. 7 suggests that one deployment of 20 RTUs would 
be sufficient to achieve the goals of the study. 


Table V—Sample representation requirements for the end office 
connection study sampling plan 

Dimension                 Stratum                   Requirement 

Mileage                   0-360 miles*              At least 50 RTU pairs 
                          360-720 miles             At least 80 RTU pairs 
                          720-1320 miles            At least 100 RTU pairs 
                          1320+ miles               At least 110 RTU pairs 

Originating switch type   ESS switching equipment   At least 6 switches 
                          Crossbar                  At least 6 switches 
                          Step-by-step              At least 6 switches, 2 of which are 
                                                    Community Dial Offices (CDOs) 

Facility                  Satellite                 At least 12 RTU pairs 
                          Terrestrial               Remainder 

Region                    Long Lines region†        3 RTUs per Long Lines region 
                                                    plus 2 in New York City 

* Airline miles 
† At the time of the EOCS sampling, Long Lines (now AT&T Communications) was 
divided into six administrative regions. 
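
The unit counts used in this argument can be checked with a short sketch. It assumes only what the text states: 20 RTUs, with experimental units defined as ordered originating-terminating pairs. The variable names are illustrative.

```python
from itertools import permutations

N_RTUS = 20  # RTUs allocated to the EOCS

# Experimental units: ordered (originating, terminating) pairs of distinct RTUs.
ordered_pairs = list(permutations(range(N_RTUS), 2))
n_units = len(ordered_pairs)

# If office pairs were sampled directly, each sampled pair could need two
# RTUs of its own, so up to 2 * n_units RTUs would be required.
worst_case_rtus = 2 * n_units

print(n_units, worst_case_rtus)
```

Sampling buildings instead of pairs keeps the RTU count at 20 while still yielding all 380 ordered pairs.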

As we noted, the sample was required to adequately represent 
various strata. Strata definitions and representation requirements 
were derived using 1969/70 Connection Survey data.’ The most im- 
portant strata variables are airline mileage and originating switch 
type. Airline mileage serves as an easily determined surrogate for route 
mileage, which is known to affect many parameters and is not easy to 
measure. Impulse noise is associated with step-by-step switches. In 
addition, strata were defined by geographic regions and by whether 
telephone connections between two locations could be carried by a 
satellite. Strata definitions and representation requirements are given 
in Table V. 

The preselection method, used in drawing a sample that meets the 
representation requirements noted above, is diagramed in Fig. 6. It 
features three steps: 

1. Creation of clusters of experimental units and definition of 
selection probabilities. 

2. The preselection step, the output of which is a set of clusters, 
guaranteed to produce an acceptable sample. 


* Many Bell operating company buildings house more than one switch, and the RTUs 
were designed for connection to up to three separate switches. Since experimental units 
are defined using switches rather than buildings, there were more than 380 such units 
in the EOCS. Multiple-switch buildings play a key role in sampling, as we will discuss. 
Fig. 6—The preselection method. 


3. The selection step, in which the actual sample of experimental 
units is drawn. 

The preselection method is quite general. Creation of clusters admits 
great latitude. Selection probabilities need not be equal, and at each 
step sampling can be with or without replacement, with or without 
stratification, and so on. 

For the EOCS, clusters were created through definition of subre- 
gions, areas of about 10,000 square miles each, in each Long Lines 
(now AT&T Communications) administrative region. Clusters in- 
cluded all Bell System buildings within a given subregion. Selection 
probabilities were based on the number of subscriber lines served by 
a building and the type(s) of switches it housed. Three weights, one 
per switch type, were defined for each building: 


$$
w(b, s) =
\begin{cases}
T_b, & \text{if switch type } s \text{ is housed in building } b, \\
0, & \text{if } s \text{ is not in } b,
\end{cases}
$$

where $T_b$ is the number of lines served by building $b$ and $s$ = 
ESS, crossbar, or step-by-step switching equipment. 


Selection probabilities were defined through normalization of weights 
throughout. They favor inclusion of buildings that house more than 
one switch type, because such buildings lead to comparisons of switch 
type performance that are not confounded with other variables. Subre- 
gion weights were calculated by summing over all buildings located 
within the subregion. 
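
The weighting scheme can be sketched as follows. This is a minimal illustration, not the EOCS implementation: it assumes that normalization is done separately for each switch type, and the building names, line counts, and function names are hypothetical.

```python
SWITCH_TYPES = ("ESS", "crossbar", "step-by-step")


def building_weights(buildings):
    """Return w[(b, s)]: lines served by b if it houses switch type s, else 0."""
    return {
        (b, s): (lines if s in housed else 0)
        for b, (lines, housed) in buildings.items()
        for s in SWITCH_TYPES
    }


def selection_probabilities(weights, switch_type):
    """Normalize the weights for one switch type so that they sum to 1."""
    column = {b: w for (b, s), w in weights.items() if s == switch_type}
    total = sum(column.values())
    if total == 0:
        raise ValueError(f"no building houses {switch_type}")
    return {b: w / total for b, w in column.items()}


def subregion_weights(weights, subregion_of):
    """Sum building weights over all buildings located in each subregion."""
    totals = {}
    for (b, s), w in weights.items():
        key = (subregion_of[b], s)
        totals[key] = totals.get(key, 0) + w
    return totals


# Hypothetical buildings: name -> (lines served, switch types housed).
buildings = {
    "A": (10_000, {"ESS", "crossbar"}),
    "B": (30_000, {"crossbar"}),
    "C": (5_000, {"step-by-step"}),
}
w = building_weights(buildings)
print(selection_probabilities(w, "crossbar"))
```

Building A carries weight under two switch types, which is how a scheme of this kind favors buildings that house more than one type.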
In preselection, subregions were sampled according to the following 
algorithm: 

1. Twenty artificial units, six each labeled ESS, crossbar, and step 
switching equipment, and two unlabeled, were defined. Starting in the 
Northeast, an artificial unit was drawn without replacement and a 
subregion sampled, again without replacement, using the subregion 
selection probabilities that correspond to the (switch) label of the 
artificial unit. 

2. Sampling proceeded in this manner until the appropriate number 
of subregions had been sampled in each region. There was one excep- 
tion to this rule: The two subregions in which Chicago and Orlando 
are located were included in the sample with probability one to help 
meet the satellite representation criterion (Table V). The label asso- 
ciated with the first artificial unit drawn in the Midwest and Southern 
regions was assigned to the Chicago and Orlando subregions, respec- 
tively. 

3. Except for the mileage criteria, the sampling scheme ensures that 
all representation requirements are met. The mileage criteria were 
checked by assuming that the selected buildings would be located at 
the center of their respective subregions and by computing the 
resultant distances. If a sample of subregions did not satisfy the 
mileage criteria, that sample would be discarded and steps 1 through 3 
repeated. 
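
The accept-or-redraw logic of step 3 can be sketched as below. This is an illustration under stated assumptions: airline mileage is approximated by the haversine great-circle distance between subregion centers, the Table V requirements are counted over ordered pairs, and `draw_subregions` stands in for steps 1 and 2. All function names are hypothetical.

```python
import math
from itertools import permutations

# Minimum number of RTU pairs per airline-mileage band (Table V).
MILEAGE_REQUIREMENTS = [
    ((0, 360), 50),
    ((360, 720), 80),
    ((720, 1320), 100),
    ((1320, math.inf), 110),
]


def airline_miles(p, q):
    """Great-circle distance in statute miles between (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))


def meets_mileage_criteria(centers, requirements=MILEAGE_REQUIREMENTS):
    """Check the band counts, treating each RTU as sitting at its subregion center."""
    counts = [0] * len(requirements)
    for p, q in permutations(centers, 2):
        d = airline_miles(p, q)
        for k, ((lo, hi), _) in enumerate(requirements):
            if lo <= d < hi:
                counts[k] += 1
    return all(c >= need for c, (_, need) in zip(counts, requirements))


def preselect(draw_subregions, is_acceptable, max_tries=10_000):
    """Redraw whole samples until one satisfies the acceptance test."""
    for _ in range(max_tries):
        sample = draw_subregions()
        if is_acceptable(sample):
            return sample
    raise RuntimeError("no acceptable sample found")
```

Because a rejected sample is discarded as a whole and redrawn, the selection step that follows only ever sees subregion sets that can meet the mileage requirements.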

Once the sample of subregions was accepted by the preselection 
step, the selection step was used to produce an actual sample of 
buildings. Building selection probabilities corresponding to the switch 
types assigned each region were used in this step. 

Locations of sampled buildings are shown in Fig. 7. 


4.2 Test connection scheduling algorithm 


When RTUs are in place and operating, the ASPEN system is 
capable of collecting data nearly continuously. Calls placed between 
RTUs simulate calls that could be placed by customers; they are not 
sampled from that population, however. This has implications for data 
analysis. 

For the EOCS the following goals were adopted: 

1. Minimize measurement equipment idle time 

2. Maximize the number of busy-hour calls placed and ensure that 
both busy- and nonbusy-hour calls are made between all RTU pairs 

3. Place and receive calls through all switches connected to each 
RTU with equal frequency 

4. Be robust to RTU failure. 

The scheduling algorithm described herein works whenever n, the 
number of RTUs, is even, though it is described here as used in the 
EOCS (n = 20). 
Fig. 7—Measurement equipment locations for end office connection study. 


The algorithm specifies a synchronous, clocked schedule. That is, 
at the beginning of a measurement sequence, ten connections (of the 
380 possible) are established and a predetermined span of time (the 
time period) is allotted for completion of test sequences. Thirty-eight 
time periods define a run, during which a call on each ordered RTU 
pair is established once and only once. The basic elements of the 
algorithm are: 

1. The RTU pair table—This table consists of all 380 possible 
ordered pairs of 20 symbols, grouped in 38 sets, with each symbol used 
once and only once in each set. When the scheduling algorithm is 
implemented, symbols will be assigned to RTUs and sets to time 
periods. Construction of this table is discussed below. 

2. The busy-hour table—This table gives the busy hour for each 
RTU and, as the experiment proceeds, the number of busy-hour 
connections established to date. 

3. The switch definition table—This table specifies the switches 
through which calls are established for each RTU pair based on the 
run number. 

At the start of each run: 

1. RTUs are randomly assigned to symbols in the RTU pair table 
to provide 38 sets of 10 RTU pairs each. 
2. Each of the 38 sets is assigned a time period. First, when the 
busy-hour table is being used, sets associated with RTU pairs on which 
few busy-hour calls have been placed are preferentially assigned time 
periods that correspond to the busy hour for those pairs. Eventually, 
such assignments can no longer be made and the remaining sets are 
randomly assigned time periods. 

3. Finally, the switches through which RTU pairs are to make each 
connection are determined by consulting the switch definition table. 

Thus far, the algorithm provides a complete schedule of calls, but 
provides no method of protecting itself from intermittent RTU failure. 
To do so, the algorithm specifies that at the start of each time period 
lists be made of out-of-service RTUs and idled RTUs (mates of RTUs 
out of service). Connections that cannot be established, including 
designated switch pairs, are then added to a “calls missed” list. Finally, 
idled RTUs are used to establish previously missed connections. 
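
The failure-handling rule can be sketched for a single time period as follows. This is a simplified illustration: it tracks RTU pairs only and ignores the designated switch pairs, and the function name and data layout are hypothetical.

```python
def plan_time_period(scheduled_pairs, out_of_service, missed):
    """Return (placed, still_missed) for one time period.

    scheduled_pairs: ordered (originating, terminating) RTU pairs for the period
    out_of_service:  RTUs that are down at the start of the period
    missed:          connections missed in earlier periods
    """
    down = set(out_of_service)
    placed, idled = [], set()
    still_missed = list(missed)
    for pair in scheduled_pairs:
        if down.isdisjoint(pair):
            placed.append(pair)
        else:
            still_missed.append(pair)  # cannot be established now
            idled.update(r for r in pair if r not in down)  # mate is idled
    # Use idled RTUs to make up previously missed connections.
    for pair in list(still_missed):
        if idled.issuperset(pair):
            placed.append(pair)
            still_missed.remove(pair)
            idled.difference_update(pair)
    return placed, still_missed


placed, left = plan_time_period([(0, 1), (2, 3), (4, 5)], [1, 3], [(2, 0)])
print(placed, left)
```

In the example, RTUs 1 and 3 are down, so their mates 0 and 2 are idled and can be used to make up the earlier missed connection (2, 0).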

The RTU pair table is the cornerstone of the scheduling algorithm 
since it establishes an efficient mechanism for specifying study test 
connections among the various RTU pairs. Creation of such a table is 
nontrivial, and so we describe the method used for the EOCS. This 
method starts with a Latin square,’* but is not completely general in 
that Latin squares exist that do not yield an RTU pair table. The 
authors speculate that restriction to a special class of Latin squares 
would lead to a method of full generality. 

The method is described below for the general n-RTU case, n even, 
and is illustrated using n = 4 throughout. 

1. Start with a Latin square of size n, using symbols 0, ..., n-1. 
Attach row and column letters in such a way that zeroes are (i, i) 
entries. 

EXAMPLE: For a four-RTU experiment, there are twelve RTU 
pairs and two pairs can be connected in each time period. Six time 
periods will be required to connect all RTUs in all possible pair 
combinations. An augmented Latin square is: 


oom Fa’) 
WOnrF © 
OW bd 
re bo OO 
oor 


2. Group the (row, column) ordered pairs that correspond to each 
nonzero entry of the Latin square. 

EXAMPLE (continued): For the entry 1 we obtain: 

(a, d), (d, a), (b, c), and (c, b). 

3. Split each group into two subgroups such that each symbol 
appears in each subgroup once and only once. This yields the RTU 
pair table. 

EXAMPLE (continued): For the group above we obtain: 

{(a, d), (b, c)} and {(d, a), (c, b)}. 

Each bracketed set corresponds to RTU pairs over which measurements 
will be made during one of the six time periods in each run of 
the four-RTU experiment. Note that these groups need not be unique. 
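
The three steps can be sketched in code under one assumption that is not taken from the text: the Latin square is built by the round-robin circle method, which gives a symmetric square with zeros on the diagonal. For such a square every group consists of pairs (i, j) together with their reversals (j, i), so the split in step 3 is always possible. The function names are hypothetical, and this need not be the square used in the EOCS.

```python
from collections import defaultdict


def symmetric_latin_square(n):
    """Symmetric Latin square of even order n with zeros on the diagonal."""
    if n < 2 or n % 2:
        raise ValueError("n must be a positive even integer")
    m = n - 1
    square = [[0] * n for _ in range(n)]
    for r in range(m):
        # Circle method: symbol n-1 is paired with r, and the remaining
        # symbols are paired symmetrically around r (mod n-1).
        pairs = [(m, r)] + [((r + k) % m, (r - k) % m) for k in range(1, n // 2)]
        for i, j in pairs:
            square[i][j] = square[j][i] = r + 1
    return square


def rtu_pair_table(square):
    """Steps 2 and 3: group ordered pairs by entry, then split each group in two."""
    n = len(square)
    groups = defaultdict(list)
    for i in range(n):
        for j in range(n):
            if square[i][j] != 0:
                groups[square[i][j]].append((i, j))
    table = []
    for entry in sorted(groups):
        forward = [(i, j) for i, j in groups[entry] if i < j]
        reverse = [(j, i) for i, j in forward]
        table.extend([forward, reverse])
    return table


table = rtu_pair_table(symmetric_latin_square(20))
print(len(table), len(table[0]))
```

For n = 20 this yields 38 sets of 10 ordered pairs, with every RTU appearing exactly once in each set. For n = 4, with symbols 0 to 3 standing for a to d, the first two sets are {(a, d), (b, c)} and {(d, a), (c, b)}.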
A comprehensive systemwide field study, referred to as the 1982/83 End 
Office Connection Study (EOCS), was undertaken from 
October 1982 through January 1983 to characterize the transmission perform- 
ance of the predivestiture Bell System public switched telecommunications 
network. Analog voice and voiceband data transmission parameters were 
measured on about 6500 direct-distance-dialing connections among 20 end 
office buildings located throughout the continental United States. The analog 
parameters measured included loss; noise; frequency response; envelope 
delay distortion; intermodulation distortion; phase jitter; amplitude jitter; 
peak-to-average ratio; frequency shift; propagation delay; and transient 
phenomena such as impulse noise, gain and phase hits, and dropouts. 
This paper presents the results of the EOCS data analysis; a companion paper 
describes the measurement equipment and the sampling plan. The perform- 
ance characterization information presented in this paper updates the similar 
information provided by a survey conducted in 1969/70. The results represent 
the last predivestiture Bell System network performance characterization and 
may serve as a benchmark for the end-to-end performance in the 
post-divestiture environment. 

I. INTRODUCTION 


Bell Laboratories undertook a comprehensive systemwide field study 
from October 1982 through January 1983 to characterize the trans- 
mission performance of the predivestiture Bell System public switched 
telecommunications network. This study, hereinafter referred to as 
the 1982/83 End Office Connection Study (EOCS), employed special 
measurement equipment, referred to as ASPEN (Automatic System 
for Performance Evaluation of the Network). Analog voice and 
voiceband data transmission parameters were measured on 6141 
Direct-Distance-Dialing (DDD) connections among 20 end office 
buildings located throughout the continental United States; in addition, 
another 395 connections were measured among four pilot study loca- 
tions at the start of the study. (Several end office buildings visited in 
the EOCS had multiple end office switches. When multiple switches 
were measured in the same building, they were individually identified 
in the EOCS database.) Measurements were typically made in one 
direction on each test connection. However, measurements were often 
made in both directions and sometimes repeated for stationarity 
studies, resulting in over 9000 1004-Hz loss measurements, for exam- 
ple. 

The ASPEN equipment consisted of 20 Remote Test Units (RTUs) 
(one per sampled end office building) under the control of a computer 
located at Holmdel, New Jersey. Each RTU was connected to the line 
side of the main distributing frame in its central office and contained 
a microprocessor, a transmission impairment measuring set, and mo- 
dems for communication with the central computer and for data 
performance tests. The transmission impairment measuring set (HLI 
3701 Communications Test Set) was designed to meet the requirements 
specified in AT&T PUB 41009’ for measuring the impairments 
described in AT&T PUB 41008.” 

Analog parameters measured on the connections included loss, 
noise, frequency response, envelope delay distortion, intermodula- 
tion distortion, phase jitter, amplitude jitter, Peak-to-Average Ratio 
(P/AR), frequency shift, propagation delay, and transient phenomena 
such as impulse noise, gain and phase hits, and dropouts. Error rates 
of voiceband data sets were also measured for 1200-b/s full-duplex 
and 4800-b/s half-duplex transmission. Table I contains a complete 
list of the EOCS measurement parameters. 

This paper presents the results of the EOCS data analysis; a com- 
panion paper’® describes the ASPEN measurement equipment and the 
sampling plan. The performance characterization information pre- 
sented in this paper updates similar information provided by the 1969/ 
70 Connection Survey.* Furthermore, with the 1984 divestiture of the 
Bell System, the results represent the last predivestiture Bell System 
network performance characterization and may serve as a benchmark 
for the end-to-end performance in the post-divestiture environment. 


Table I—Parameter coverage 

Classification                          Measured Parameter 

Voice and voiceband data transmission   1004-Hz insertion loss 
                                        Frequency response (30 frequencies) 
                                        C-message noise 
                                        Propagation delay 
                                        Call cutoffs 

Voiceband data transmission             Signal-to-C-notched-noise ratio 
                                        C-notched noise 
                                        3-kHz flat noise 
                                        3-kHz flat notched noise 
                                        3-kHz noise-to-ground 
                                        Envelope delay distortion (30 frequencies) 
                                        P/AR 
                                        Second-order intermodulation distortion 
                                        Third-order intermodulation distortion 
                                        Phase jitter (2 to 300 Hz) 
                                        Phase jitter (20 to 300 Hz) 
                                        Amplitude jitter (2 to 300 Hz) 
                                        Amplitude jitter (20 to 300 Hz) 
                                        Frequency shift 
                                        Impulse noise (six thresholds) 
                                        Phase hits (three thresholds) 
                                        Gain hits (three thresholds) 
                                        Dropouts 

Data set error performance              1200-b/s bit error rate (two modems) 
                                        4800-b/s bit error rate (two modems) 
                                        1200-b/s block error rate (two modems) 
                                        4800-b/s block error rate (one modem) 


II. THE DATA ANALYSIS METHOD 


The primary goal of data analysis in the EOCS was to provide 
estimates of the distributions of the parameters for various strata of 
interest and for the overall network. Estimation of the distribution 
was selected, since other measures such as means, variances, and 
quantiles are obtained as a function of the distribution. 

The sampling design of the EOCS involved a stratification according 
to two variables:* connection airline mileage and type of switch at the 
end office. An analysis of covariance (with switch type as factor and 
mileage as covariate) was carried out on each parameter to determine 
the effects of these two variables. Although the effects of mileage and 
switch type were often both significant at the 5-percent level, one 
effect usually dominated the other, as judged by the value of its 
F-statistic in the covariance model. Results are displayed as a 
function of the most relevant effect. 

When a display as a function of airline mileage is deemed necessary, 
the results are split into three categories: short (0 to 180 miles), 
medium (181 to 720 miles), and long (>720 miles). These particular 
mileage bands were chosen so that the EOCS results could be compared 
with similar results from the 1969/70 Connection Survey.’ 

The number of measurements varies with the measured parameter. 
To give an idea of the number of measurements taken for different 
strata, the number of 1004-Hz loss measurements made in the EOCS 
is shown in Fig. 1 as a function of airline mileage in 100-mile blocks, 
where the abscissa shows the midpoints of the 100-mile blocks (e.g., 5 
represents the 451- to 550-mile block). Table II gives the number of 
1004-Hz loss measurements in the EOCS for the short, medium, and 
long mileage categories, together with the number of different pairs of 
end office buildings evaluated in the EOCS. The estimated percentage 
of toll traffic’ in each of the three mileage bands is also listed. Table 
III shows the number of 1004-Hz loss measurements made in the 
EOCS for each of the three types of switch—electronic switching 
system (digital switch), crossbar, and step-by-step switches—in the 
measuring end office, as well as the percentage of toll traffic estimated 
in each of the three switch strata.’ 


Table II—Stratification of data by connection airline mileage 

                                Estimated       Number of Loss     Number of          Percent 
              Airline           Route           (1004-Hz)          Nonordered         of Toll 
Designation   Mileage           Mileage         Measurements       Pairs of Sites     Traffic 

Short         0-180             0-341           776  (8.1%)        11  (5%)           77.8 
Medium        181-720           342-1064        3717 (38.8%)       76  (34.2%)        13.4 
Long          721-2576          1065-3378       5088 (53.1%)       135 (60.8%)        8.8 
Total                                           9581               222 

Table III—Stratification of data by type of 
switch in the office where measurements were 
made 

                 Number of Loss 
                 (1004-Hz)          Percent of Toll 
Designation      Measurements       Traffic 

Digital switch   4154 (43.3%)       58 
Crossbar         4146 (43.3%)       30 
Step-by-step     1281 (13.4%)       12 
Total            9581 
Fig. 1—Number of 1004-Hz loss measurements versus connection airline mileage in 
100-mile blocks. 

Fig. 2—The 1004-Hz loss versus connection airline mileage in 100-mile blocks. 

Within each stratum, the measurements were considered 
self-weighted. Weights proportional to the traffic occurring in each stratum 
were used for calculation of results relevant to the whole network. A 
statistical model of components of variance was considered within 
each stratum for each parameter. The model led to the estimation of 
the variability of a parameter on a given pair of end office buildings, 
and of the variability between different pairs of end office buildings. 
These two variabilities, together with the means, are displayed in 
Table IV for all EOCS parameters other than the transient parameters. 
The variabilities are taken into consideration in the evaluation of the 
90-percent confidence interval around the mean in each of the three 
mileage band strata, also shown in Table IV. 
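
The traffic weighting can be illustrated with the 1004-Hz loss figures. The sketch below assumes that the network-wide mean is the weighted average of the stratum means, with the toll-traffic percentages of Table II (77.8, 13.4, and 8.8) as weights and the EOCS stratum means of Table V (6.5, 7.3, and 7.8 dB) as inputs. The function name is hypothetical.

```python
def traffic_weighted(values, traffic_percent):
    """Weighted average of stratum values, weights proportional to traffic."""
    total = sum(traffic_percent.values())
    return sum(values[k] * traffic_percent[k] for k in values) / total


eocs_mean_loss_db = {"short": 6.5, "medium": 7.3, "long": 7.8}   # Table V, EOCS
toll_traffic_pct = {"short": 77.8, "medium": 13.4, "long": 8.8}  # Table II

overall = traffic_weighted(eocs_mean_loss_db, toll_traffic_pct)
print(round(overall, 1))
```

The result rounds to 6.7 dB, which agrees with the "All" entry for the EOCS in Table V.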

One goal of this paper is to characterize, where possible, the cus- 
tomer-premises-to-customer-premises performance, i.e., the end-of- 
fice-to-end-office connection plus two loops. However, the impairment 
contributions from customer loops to the overall connection are usu- 
ally negligible for most of the parameters discussed here. Therefore, 
the characterization information presented in this paper is, for the 
most part, based on the information available for end-office-to-end- 
office connections only, i.e., the EOCS results. However, for 1004-Hz 
loss, frequency response, and C-message noise, for which loops are 
significant, loop impairment contributions are concatenated to the 
end office to end office values, using a common source of information 
for the loop impairments, the 1980 Loop Survey.® 

To concatenate the loop effects for loss and noise, distributions for 
1224 predivestiture Bell System representative loops for 1004-Hz loss 
(measured at the main distributing frame) and C-message noise (mea- 
sured at the customer premises) were extracted from the 1980 Loop 
Survey. Loop loss values were available at only three frequencies from 
that survey, so loop frequency responses were calculated from the 
transfer characteristics of loops obtained from the record survey 
associated with the 1980 Loop Survey. The distribution of a parameter 
for customer-premises-to-customer-premises connections was then ob- 
tained by analytically concatenating (using discrete convolution tech- 
niques) the distributions of the parameter on the loops with the 
distribution of the same parameter on the end-office-to-end-office 
connections. For frequency response, this concatenation process was 
repeated at each frequency. 
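
For a parameter that adds in decibels, such as loss, the concatenation can be sketched as a discrete convolution of probability mass functions. The sketch assumes that the two loops and the end-office-to-end-office connection are statistically independent and that all distributions are tabulated on a common dB grid. The distributions shown are hypothetical, not survey data.

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions on a common grid."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def concatenate_loss(loop_pmf, office_pmf):
    """pmf of loop + office-to-office + loop loss (grid offsets add as well)."""
    return convolve(convolve(loop_pmf, office_pmf), loop_pmf)


def mean(pmf):
    """Mean of a pmf in grid steps, with index 0 as the origin."""
    return sum(k * p for k, p in enumerate(pmf))


# Hypothetical pmfs on a 1-dB grid (index k means k dB above each grid origin).
loop_pmf = [0.2, 0.5, 0.3]
office_pmf = [0.1, 0.6, 0.3]
total_pmf = concatenate_loss(loop_pmf, office_pmf)
print(total_pmf)
```

The mean of the concatenated distribution equals the sum of the component means, which gives a quick consistency check on any such calculation.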


III. RESULTS 

All test signals were transmitted from the RTUs at -12 dBm. 

3.1 1004-Hz loss 

A 1004-Hz tone at -12 dBm was applied to one end of the connection, 
and the received level at the other end was measured, with the 
difference in levels being defined as the loss of the connection. (If a 
tone of exactly 1000 Hz [an integer submultiple of the 8000-Hz 
sampling rate in time-division multiplex systems] were used, the loss 
readings would “bobble” by ±0.25 dB, the signal-to-C-notched-noise 
ratio readings would vary by ±5 dB, and the jitter readings 
would be erratic. For this reason, a 1004-Hz tone was used in the 
EOCS as the test tone for loss and as the holding tone for C-notched 
noise, jitter, and transients. [This is the frequency used in modern 
transmission maintenance systems.]) The means and standard 
deviations of the losses are summarized in Table V for the three mileage 
bands. For comparison, the loss results of the 1969/70 Connection 
Survey* are also shown in the same table. The results from both 
surveys show that the mean loss increases with mileage, as would be 
expected from the Via Net Loss (VNL) transmission plan. The VNL 
design of trunks calls for increasing trunk loss with distance to offset 
the subjective effect of the increasing echo path delay, until a mileage 
is reached where an echo suppressor or echo canceler is applied to the 
trunk. Comparison of the two surveys shows that the mean loss is 
about the same in both instances, but the standard deviation observed 
in the EOCS is substantially smaller than in the 1969/70 Connection 
Survey. 
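
The reason for avoiding exactly 1000 Hz can be illustrated by counting the distinct phases at which an 8000-Hz sampler sees the tone. This is a simplified illustration of the submultiple effect only, not a model of the measuring set. The function name is hypothetical.

```python
from math import gcd


def distinct_sample_phases(tone_hz, sample_rate_hz=8000):
    """Number of distinct tone phases sampled before the pattern repeats."""
    return sample_rate_hz // gcd(tone_hz, sample_rate_hz)


print(distinct_sample_phases(1000), distinct_sample_phases(1004))
```

A 1000-Hz tone is always sampled at the same 8 phases, so a reading depends on where those phases happen to fall on the waveform. A 1004-Hz tone sweeps through 2000 distinct phases, which averages out that dependence.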

Figure 2 shows “boxplots” of the loss versus airline mileage between 
the end offices in 100-mile blocks, with the same abscissa as that of 
Fig. 1. (See Fig. 1 for the number of measurements in each 100-mile 
block.) For each mileage block on the abscissa, the corresponding 
boxplot shows the variability of the measurements falling in that 
mileage block. The upper and lower boundaries of the “box” indicate 
the 75th and 25th percentiles of the distribution, and the line inside 
the box indicates the median. Therefore, the height of the box, referred 
to as the interquartile range, is a measure of variability in the distri- 
bution, and the deviation of the median line from the center of the 
box shows skewness of the distribution. The distance from the median 
line to the tip of the “whisker” is equal to 1.5 times the interquartile 
range if there are points falling outside the whisker. If all points fall 
within the range of 1.5 times the interquartile range on either side of 
the box, the tip of the lower (or upper) whisker coincides with the 
minimum (or maximum) value. 
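
The boxplot quantities can be computed as in the sketch below. It rests on two assumptions that the text does not settle: quartiles are computed by linear interpolation (the "inclusive" method of Python's statistics module), and the 1.5 times interquartile range is measured from the edges of the box. The data are hypothetical.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Quartiles, interquartile range, and whisker tips for one mileage block."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # A whisker stops at 1.5 * IQR from the box if points lie beyond that;
    # otherwise it ends at the minimum (or maximum) value.
    lower = max(min(values), q1 - 1.5 * iqr)
    upper = min(max(values), q3 + 1.5 * iqr)
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "lower_whisker": lower, "upper_whisker": upper}


summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```

In the example the value 100 lies beyond the upper limit, so the upper whisker stops at 1.5 times the interquartile range above the box, while the lower whisker ends at the minimum.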

Figure 2 shows an overall tendency of increasing loss with mileage 
of up to about 1200 airline miles, and a decline thereafter. This loss 
decline can be attributed to the application of echo control devices on 
trunks, for which the intertoll trunk losses are set to zero. There seems 
to be a substantial variability in loss for distances between 1200 and 
2200 miles. In this mileage region, connections with and without echo 
control devices are likely to be encountered, which could account for 
Table IV—Summary of means and variability measures for analog parameters on end-to-end toll connections 
(SDB = standard deviation between pairs of end offices; SDW = standard deviation within pairs of end offices) 

                                 Short Mileage Band          Medium Mileage Band         Long Mileage Band           All (Weighted) 
Parameter                        Mean          SDB    SDW    Mean          SDB    SDW    Mean          SDB    SDW    Mean 

Loss 
  404 Hz                         ...           1.07   1.37   ...           ...    ...    ...           1.64   1.48   ... 
  1004 Hz                        ...           0.93   1.34   ...           ...    ...    ...           1.24   1.29   ... 
  2804 Hz                        ...           1.72   2.00   ...           ...    ...    ...           1.49   1.55   ... 
EDD (µs) 
  604 Hz                         853 ± 198     316    155    1150 ± 87     393    259    1337 ± 101    571    224    935 ± 155 
  2804 Hz                        599 ± 111     175    149    701 ± 45      203    185    727 ± 39      219    161    624 ± 86 
P/AR                             87.5 ± 3.0    4.7    4.0    84.5 ± 1.2    5.3    5.5    84.2 ± 1.3    ...    ...    86.8 ± 2.3 
Delay round trip (ms) 
  No satellite                   9.6 ± 2.0     3.6    2.2    16.0 ± 0.8    3.8    2.4    32.6 ± 1.4    ...    ...    12.5 ± 1.6 
  Only satellite                 ...           ...    ...    ...           ...    ...    ...           ...    ...    ... 
Noise 
  C-message (dBrnC)              24.6 ± 2.2    ...    ...    28.5 ± 0.5    ...    ...    31.4 ± 0.5    ...    ...    25.7 ± 1.7 
  C-notch (dBrnC)                36.5 ± 1.1    ...    ...    37.4 ± 0.3    ...    ...    ...           ...    ...    ... 
  s/n (dB)                       34.9 ± 1.1    ...    ...    ...           ...    ...    ...           ...    ...    ... 
  Flat (dBrn)                    43.9 ± 2.6    ...    ...    ...           ...    ...    ...           ...    ...    ... 
  Flat notch (dBrn)              46.2 ± 1.4    ...    ...    ...           ...    ...    ...           ...    ...    ... 
  Noise-to-ground (dBrn)         52.0 ± 3.6    ...    ...    ...           ...    ...    ...           ...    ...    ... 
IMD (dB) 
  Second-order                   53.3 ± 1.8    ...    ...    52.2 ± 0.6    ...    ...    51.6 ± 0.6    ...    ...    ... 
  Third-order                    53.4 ± 2.5    ...    ...    51.0 ± 0.9    ...    ...    50.2 ± 0.8    ...    ...    ... 
Jitter 
  Amplitude, 2-300 Hz (%)        2.8 ± 0.5     0.8    1.1    3.2 ± 0.1     0.4    1.3    3.6 ± 0.1     0.4    1.5    2.9 ± 0.4 
  Amplitude, 20-300 Hz (%)       2.6 ± 0.4     0.7    0.8    2.8 ± 0.1     0.3    1.0    3.2 ± 0.1     0.4    1.1    2.7 ± 0.3 
  Phase, 2-300 Hz (deg)          5.5 ± 1.9     3.0    2.2    6.4 ± 0.5     2.1    2.8    7.6 ± 0.4     2.5    2.8    5.8 ± 1.5 
  Phase, 20-300 Hz (deg)         3.1 ± 0.5     0.8    1.2    3.9 ± 0.2     1.1    1.3    4.9 ± 0.2     1.3    1.6    3.3 ± 0.4 

Table V—Comparison of 1004-Hz loss from 1982/83 EOCS and 
1969/70 Connection Survey 

                     1969/70 Survey                                  EOCS Survey 
                     Primary                 Secondary               DDD 
Connection                       Standard                Standard                Standard 
Length (Air-         Mean        Deviation   Mean        Deviation   Mean        Deviation 
line Miles)          (dB)        (dB)        (dB)        (dB)        (dB)        (dB) 

All                  6.7 ± 0.6   2.1         6.6 ± 0.3   2.1         6.7 ± 0.4   1.3 
0-180                6.5 ± 0.7   2.0         6.4 ± 0.4   2.1         6.5 ± 0.6   1.6 
180-720              7.3 ± 0.4   2.3         7.1 ± 0.6   2.1         7.3 ± 0.2   1.4 
721-2900             7.7 ± 0.5   2.5         7.4 ± 0.3   2.0         7.8 ± 0.2   1.8 


the variability. Above about 2200 miles, the median loss is nearly 
constant at 6 dB and the variability is small, suggesting that most of 
the connections in that mileage region have an echo control device. 
The constant 6-dB loss observed is consistent with a picture of toll 
connections with one toll-connecting trunk at each end with about 3- 
dB loss, plus a long, zero-loss intertoll trunk with an echo control 
device. 

Figure 3 gives the Cumulative Distribution Functions (CDFs)* of 
loss for the three mileage bands. Most of the losses—75 to 90 percent, 
depending on the mileage category—are greater than 6 dB. This 
observation is consistent with typical toll connections consisting of 
the two toll-connecting trunks described above, plus zero to seven (but 
rarely more than two) intertoll trunks with VNL design losses ranging 
from 0.5 to 2.9 dB. The losses at the lower tail for the short and 
medium categories could have come from the connections with no 
intertoll trunks or with intertoll trunks with negligible VNL loss. The 
lower tail losses for the long category could have come from the 
connections with zero-loss intertoll trunks with echo control devices. 

Figure 4 shows the CDFs of 1004-Hz loss for customer-premises-to- 
customer-premises connections. These CDFs were derived by analyt- 
ically concatenating the 1004-Hz loss of the 1980 Loop Survey’ to the 
end office to end office 1004-Hz loss. Figure 5 shows the CDF of 1004- 
Hz loss for the loops used in the concatenation. 


3.2 Frequency response 


Frequency response, also referred to as loss-versus-frequency char- 
acteristic or attenuation distortion, is a measure of loss variation over 
the frequency band of a communications channel. It can be measured 


* All CDFs in this paper are plotted with an ordinate having the normal probability 
scale. A normal CDF will show up as a straight line on such a plot, and the vertical 
scale near the tails of the distribution will be expanded for greater readability. 
Fig. 3—CDFs of 1004-Hz loss for the short, medium, and long mileage bands. 


with sinusoidal test tones or with the 50-percent amplitude-modulated 
test signal employed to measure the Envelope Delay Distortion (EDD) 
characteristic. The response of an averaging detector (specified for 
level measurement by Ref. 1) is the same for a sinusoidal tone as for 
a 50-percent amplitude-modulated signal. 

To characterize frequency response, loss was measured from 204 to 
3504 Hz. Losses at 204, 254, and 3504 Hz were obtained from a 
sinusoidal test tone, while the losses at the other frequencies were 
Fig. 4—CDFs of customer premises-to-customer premises 1004-Hz loss. 

Fig. 5—CDF of 1004-Hz loss on customer loops. 


obtained from level measurements of the envelope delay distortion 
test signal. 

Tables VI through VIII show the means, standard deviations, and 
selected percentiles of loss versus frequency relative to 1004 Hz for 
Fig. 6—Mean attenuation distortion relative to 1004 Hz for short, medium, and long 
mileage bands. 
Table VI—Attenuation distortion relative to 1004 Hz: short 
connections 

Frequency                 Standard     Percentiles 
in Hz       Mean          Deviation    1%      10%     50%     90%     99% 

204         5.1 ± 2.9     2.8          2.0     2.5     3.8     10.0    13.2 
254         3.3 ± 1.9     1.9          0.9     1.4     2.8     6.4     7.4 
304         1.8 ± 1.1     1.2          0.2     0.5     1.6     3.7     4.7 
404         1.1 ± 0.4     0.7          -0.8    0.4     1.1     2.0     2.9 
504         0.7 ± 0.2     0.4          0.1     0.2     0.7     1.3     2.0 
604         0.4 ± 0.1     0.3          -0.6    0.0     0.4     0.9     1.3 
704         0.3 ± 0.0     0.3          -0.4    0.0     0.3     0.6     1.2 
804         0.2 ± 0.1     0.2          -0.5    0.0     0.2     0.4     0.6 
904         0.1 ± 0.0     0.3          -1.2    0.0     0.1     0.2     1.1 
1004        0.0 ± 0.0     0.0          0.0     0.0     0.0     0.0     0.0 
1104        -0.1 ± 0.1    0.4          -2.3    -0.3    -0.1    0.1     1.3 
1204        -0.1 ± 0.1    0.2          -0.6    -0.4    -0.1    0.0     0.4 
1304        -0.2 ± 0.2    0.3          -1.7    -0.5    -0.1    0.1     0.7 
1404        -0.2 ± 0.2    0.4          -1.2    -0.6    -0.2    0.1     0.6 
1504        -0.2 ± 0.2    0.4          -1.3    -0.6    -0.1    0.2     0.8 
1604        -0.1 ± 0.1    0.4          -1.2    -0.5    -0.1    0.2     1.0 
1704        0.0 ± 0.1     0.4          -1.3    -0.4    0.0     0.3     1.0 
1804        0.0 ± 0.2     0.4          -1.5    -0.3    0.0     0.4     1.3 
1904        0.1 ± 0.3     0.4          -1.2    -0.3    0.0     0.6     1.2 
2004        0.1 ± 0.5     0.6          -1.4    -0.4    -0.1    0.9     1.5 
2104        0.2 ± 0.7     0.7          -1.4    -0.4    -0.1    1.0     2.5 
2204        0.3 ± 0.8     0.9          -1.4    -0.4    -0.1    1.4     3.3 
2304        0.4 ± 0.9     1.0          -1.2    -0.4    0.0     1.9     3.4 
2404        0.6 ± 1.1     1.1          -0.9    -0.3    0.1     2.3     3.5 
2804        ...           2.6          -0.6    -0.1    0.3     5.8     10.0 
2904        2.0 ± 3.2     3.0          -0.7    -0.2    0.4     6.8     11.4 
3004        2.3 ± 3.5     3.3          -0.6    -0.2    0.5     7.3     12.3 
3104        3.1 ± 4.2     4.0          -0.4    0.0     0.9     8.8     15.3 
3204        4.1 ± 4.9     4.6          0.2     0.5     1.5     11.5    18.2 
3304        5.3 ± 5.3     5.0          1.1     1.5     2.3     13.8    17.7 
3404        7.4 ± 4.6     4.2          3.2     5.0     5.9     14.4    23.8 
3504        16.5 ± 7.5    6.0          4.2     5.2     19.2    21.2    22.2 


short, medium, and long connections. In the tables, the measured
frequencies above 304 Hz are at intervals of 100 Hz, except that 2504,
2604, and 2704 Hz were omitted: because the 2600-Hz idle-circuit tone
is widely used for signaling, a test signal with energy concentrated
near 2600 Hz could inadvertently cause disconnection.

Figure 6 presents the mean losses of Tables VI through VIII as a 
function of frequency for short, medium, and long connections. The 
losses between 2404 and 2804 Hz (not measured in the EOCS) were 
obtained by linear interpolation. 

For comparison, the mean losses relative to 1004 Hz from the 1969/ 
70 Connection Survey are also shown in Fig. 6. As we can see in the 
figure, the mean frequency response has improved since that survey. 
This improvement can be attributed to the increasing use of T-carrier 
to replace voice-frequency cable facilities in the toll-connecting portion 




Table VII—Attenuation distortion relative to 1004 Hz: medium connections

Frequency           Standard                  Quantiles
in Hz       Mean   Deviation      1%     10%     50%     90%     99%

  204     5.7 ± 0.3      2.1     2.4     3.6     5.3     8.3    13.1
  254     3.4 ± 0.2      1.4     1.0     1.9     3.2     5.2     8.2
  304     2.1 ± 0.1      1.0     0.3     1.0     2.0     3.2     5.5
  404     1.4 ± 0.1      0.7     0.0     0.7     1.4     2.2     3.3
  504     0.9 ± 0.1      0.5    -0.1     0.4     0.9     1.5     2.4
  604     0.6 ± 0.0      0.3     0.0     0.3     0.6     1.1     1.6
  704     0.5 ± 0.0      0.3    -0.1     0.2     0.5     0.8     1.3
  804     0.3 ± 0.0      0.2    -0.1     0.1     0.3     0.5     0.9
  904     0.1 ± 0.0      0.1    -0.2     0.0     0.1     0.2     0.5
 1004     0.0 ± 0.0      0.0     0.0     0.0     0.0     0.0     0.0
 1104     0.0 ± 0.0      0.1    -0.4    -0.2     0.0     0.1     0.2
 1204    -0.1 ± 0.0      0.2    -0.6    -0.3    -0.1     0.1     0.3
 1304    -0.2 ± 0.0      0.2    -0.8    -0.4    -0.1     0.1     0.3
 1404    -0.2 ± 0.0      0.2    -0.9    -0.5    -0.2     0.0     0.3
 1504    -0.2 ± 0.0      0.3    -0.9    -0.5    -0.2     0.1     0.3
 1604    -0.2 ± 0.0      0.3    -1.0    -0.5    -0.2     0.1     0.4
 1704    -0.1 ± 0.0      0.4    -0.9    -0.5    -0.1     0.2     0.6
 1804    -0.1 ± 0.0      0.3    -1.0    -0.5    -0.1     0.3     0.7
 1904    -0.1 ± 0.0      0.4    -1.0    -0.5    -0.1     0.3     0.8
 2004    -0.1 ± 0.1      0.6    -1.1    -0.5    -0.1     0.4     1.0
 2104    -0.1 ± 0.1      0.6    -1.1    -0.6    -0.1     0.5     1.1
 2204     0.0 ± 0.1      0.6    -1.1    -0.5     0.0     0.6     1.4
 2304     0.1 ± 0.1      0.7    -1.0    -0.5     0.0     0.7     1.7
 2404     0.3 ± 0.1      0.9    -1.0    -0.4     0.2     0.9     2.1
 2804     0.8 ± 0.2      1.2    -0.7    -0.1     0.6     2.0     5.4
 2904     1.0 ± 0.2      1.1    -0.6    -0.0     0.7     2.5     4.4
 3004     1.2 ± 0.3      1.3    -0.6     0.0     0.8     3.3     5.4
 3104     1.6 ± 0.3      1.6    -0.5     0.2     1.1     4.5     6.8
 3204     2.3 ± 0.4      2.0     0.2     0.8     1.8     5.9     8.9
 3304     3.8 ± 0.5      2.5     1.3     1.8     3.0     8.4    12.5
 3404     7.5 ± 0.5      2.5     4.1     5.4     6.9    10.1    16.4
 3504    19.2 ± 0.5      2.5    12.7    16.0    19.3    22.2    24.3


of the network. This is corroborated by the observation that digital 
channel banks of the D3 and D4 type, which provide the interface 
between the analog voice-frequency signals and the 1.544-Mb/s digital 
bit stream of the T-carrier system, have a frequency response similar 
to that of the current survey, shown in Fig. 6. 

The gain slope at 404 or 2804 Hz is defined as the loss at that 
frequency minus the loss at 1004 Hz. Figure 7 plots the CDFs of the 
larger of the gain slopes (per connection) at 404 and 2804 Hz for short, 
medium, and long connections. Figures 8 through 10 show the CDFs 
of loss difference for the frequency pairs of 2804 and 604 Hz, 2404 and 
804 Hz, and 2104 and 1304 Hz, respectively. As the figures show, there 
is little mileage dependence for these loss differences except for a 
tendency for short connections (possibly on VF cables) to have some- 
what higher loss difference. 
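
As an illustration of the gain-slope definition above, the following Python sketch computes the per-connection quantity plotted in Fig. 7, i.e., the larger of the two gain slopes. The loss values are hypothetical, and the function names are introduced here for illustration only; they are not taken from the study.

```python
REFERENCE_HZ = 1004

def gain_slope(loss_db, freq_hz):
    """Loss at freq_hz minus loss at 1004 Hz, in dB (the definition in the text)."""
    return loss_db[freq_hz] - loss_db[REFERENCE_HZ]

def max_gain_slope(loss_db):
    """Larger of the gain slopes at 404 and 2804 Hz for one connection."""
    return max(gain_slope(loss_db, 404), gain_slope(loss_db, 2804))

# Hypothetical absolute losses (dB) for two connections, for illustration only.
connections = [
    {404: 7.4, 1004: 6.0, 2804: 6.8},
    {404: 9.1, 1004: 8.5, 2804: 11.0},
]
slopes = [round(max_gain_slope(c), 1) for c in connections]
print(slopes)
```

A CDF such as those in Fig. 7 would then be formed from these per-connection maxima rather than from the two slopes separately.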

Table VIII—Attenuation distortion relative to 1004 Hz: long connections

Frequency           Standard                  Quantiles
in Hz       Mean   Deviation      1%     10%     50%     90%     99%

  204     5.9 ± 0.4      2.8     2.2     3.1     5.0    10.1    14.1
  254     3.7 ± 0.3      2.0     0.7     1.7     3.2     6.6    10.4
  304     2.1 ± 0.2      1.5    -0.2     0.5     1.8     4.2     7.0
  404     1.4 ± 0.1      0.9    -0.2     0.4     1.2     2.5     4.3
  504     0.8 ± 0.1      0.5    -0.2     0.2     0.8     1.5     2.5
  604     0.5 ± 0.0      0.4    -0.2     0.0     0.5     1.0     1.6
  704     0.4 ± 0.0      0.3    -0.3     0.0     0.4     0.8     1.3
  804     0.2 ± 0.0      0.2    -0.2     0.0     0.2     0.5     0.9
  904     0.1 ± 0.0      0.1    -0.2     0.0     0.1     0.2     0.5
 1004     0.0 ± 0.0      0.0     0.0     0.0     0.0     0.0     0.0
 1104     0.0 ± 0.0      0.2    -0.5    -0.2     0.0     0.1     0.3
 1204    -0.1 ± 0.0      0.2    -0.7    -0.3    -0.1     0.1     0.4
 1304    -0.1 ± 0.0      0.2    -0.8    -0.4    -0.1     0.1     0.4
 1404    -0.1 ± 0.0      0.4    -0.9    -0.4    -0.1     0.1     0.4
 1504    -0.1 ± 0.0      0.4    -1.0    -0.5    -0.1     0.1     0.5
 1604    -0.1 ± 0.0      0.4    -1.1    -0.5    -0.1     0.2     0.6
 1704    -0.1 ± 0.0      0.4    -1.2    -0.5     0.0     0.3     0.7
 1804     0.0 ± 0.0      0.5    -1.2    -0.5     0.0     0.3     0.8
 1904    -0.1 ± 0.0      0.5    -1.2    -0.5    -0.1     0.4     0.9
 2004    -0.1 ± 0.0      0.5    -1.3    -0.6    -0.1     0.4     1.1
 2104    -0.1 ± 0.1      0.6    -1.3    -0.6    -0.1     0.4     1.3
 2204     0.0 ± 0.1      0.6    -1.2    -0.6    -0.1     0.5     1.6
 2304     0.1 ± 0.1      0.7    -1.1    -0.5     0.0     0.7     1.8
 2404     0.3 ± 0.1      0.8    -1.0    -0.4     0.2     1.0     2.5
 2804     0.8 ± 0.1      1.1    -0.8    -0.1     0.6     1.9     4.7
 2904     0.9 ± 0.2      1.2    -0.8    -0.2     0.6     2.3     5.0
 3004     1.0 ± 0.2      1.3    -0.7    -0.1     0.7     2.6     5.7
 3104     1.4 ± 0.2      1.6    -0.6     0.1     1.0     3.2     7.4
 3204     2.2 ± 0.3      1.9     0.0     0.6     1.7     4.2    10.2
 3304     3.6 ± 0.4      2.4     1.0     1.6     2.9     6.2    13.4
 3404     7.5 ± 0.3      2.1     4.2     5.4     6.9    10.4    14.3
 3504    19.2 ± 0.6      2.8     9.5    16.4    19.5    22.0    25.4


Figure 11 shows the same type of information that is in Fig. 6 for 
customer-premises-to-customer-premises connections, obtained by 
concatenating the frequency response of the loops to the frequency 
response of the EOCS connections. The concatenation used the 1980
Loop Survey data and the same technique described in the previous
section. The mean frequency response of the loops (two per
connection) used in the concatenation is also shown in Fig. 11. Loop 
effects clearly dominate trunk effects in end-to-end frequency re- 
sponse. 


3.3 Envelope delay distortion 


Fig. 7—CDFs of the maximum of the two gain slopes at 404 and 2804 Hz relative to
1004 Hz for short, medium, and long mileage bands.

Fig. 8—CDFs of loss at 2804 Hz minus loss at 604 Hz for short, medium, and long
mileage bands.

Fig. 9—CDFs of loss at 2404 Hz minus loss at 804 Hz for short, medium, and long
mileage bands.

Fig. 10—CDFs of loss at 2104 Hz minus loss at 1304 Hz for short, medium, and long
mileage bands.

Fig. 11—Mean customer-premises-to-customer-premises attenuation distortion
relative to 1004 Hz obtained by analytically concatenating the EOCS result and the 1980
Loop Survey result. The 1980 Loop Survey result is also shown.

Fig. 12—Mean EDD for short, medium, and long mileage bands.

Envelope delay is defined as the negative of the derivative of the
phase of the received signal with respect to frequency. Envelope Delay
Distortion (EDD) at a given frequency is defined as the difference
between the envelope delay at that frequency and the envelope delay
at the reference frequency (usually between 1600 and 1800 Hz), where
the delay is near the minimum value. The reference frequency used in
the EOCS was 1704 Hz.
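
The definition above can be illustrated numerically. The Python sketch below estimates envelope delay from samples of a phase characteristic by central differences and then forms EDD relative to 1704 Hz. It assumes that "frequency" in the definition means angular frequency, so that the delay comes out in seconds. The channel model (a flat 5-ms delay plus a cubic phase term) and all names are hypothetical, and this is not the measurement method used in the EOCS.

```python
import math

def envelope_delay_s(freqs_hz, phase_rad):
    """Central-difference estimate of -d(phase)/d(omega) at interior grid points."""
    delays = {}
    for i in range(1, len(freqs_hz) - 1):
        d_phase = phase_rad[i + 1] - phase_rad[i - 1]
        d_omega = 2 * math.pi * (freqs_hz[i + 1] - freqs_hz[i - 1])
        delays[freqs_hz[i]] = -d_phase / d_omega
    return delays

def edd_s(delays, ref_hz=1704):
    """EDD: envelope delay at each frequency minus the delay at the reference."""
    return {f: d - delays[ref_hz] for f, d in delays.items()}

# Hypothetical channel: flat delay plus a cubic phase term centered on 1704 Hz.
FLAT_DELAY_S = 0.005
K = 1e-9  # rad/Hz^3, illustrative only
freqs = list(range(204, 3605, 100))
phase = [-2 * math.pi * f * FLAT_DELAY_S - K * (f - 1704) ** 3 for f in freqs]

edd = edd_s(envelope_delay_s(freqs, phase))
print(round(edd[604] * 1e6), "microseconds at 604 Hz")
```

Because the flat delay cancels in the subtraction, EDD reflects only the curvature of the phase characteristic, which is why it is zero at the reference frequency by construction.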

The test signal used for envelope delay distortion measurement in 
the United States, which is also used in the EOCS, is a voiceband 
carrier frequency amplitude modulated (50 percent) by an 83-1/3 Hz 
tone. A return reference path is required to establish the phase 
reference so that the phase of the transmitted 83-1/3 Hz envelope can 
be compared to the phase of the received 83-1/3 Hz envelope. Since 
the frequency aperture of the modulated test signal remains fixed at 
twice 83-1/3 Hz (166-2/3 Hz) as the carrier frequency of the modulated test
signal is varied, the recovered phase changes give estimates of the 
slope of the phase-versus-frequency curve (EDD). The more desirable 
direct measurement of phase versus frequency is not made because of 
the possibility of frequency shift on the facility. 
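
The relation between the recovered envelope phase and envelope delay can be sketched as follows. The Python example assumes a channel whose amplitude response is flat across the aperture, so that the lag of the received 83-1/3 Hz envelope is half the phase difference between the two sidebands. The function names and the 1-ms test channel are hypothetical and serve only to illustrate the principle, not the test set's internal processing.

```python
import math

F_MOD_HZ = 250.0 / 3.0  # 83-1/3 Hz modulating tone

def envelope_lag_rad(phase_of, carrier_hz, f_mod_hz=F_MOD_HZ):
    """Lag of the received envelope, from the channel phase at the two sidebands."""
    upper = phase_of(carrier_hz + f_mod_hz)
    lower = phase_of(carrier_hz - f_mod_hz)
    return -(upper - lower) / 2.0

def delay_from_envelope_lag_s(lag_rad, f_mod_hz=F_MOD_HZ):
    """Envelope delay implied by an envelope lag (known only modulo one 12-ms cycle)."""
    return lag_rad / (2 * math.pi * f_mod_hz)

# Hypothetical channel: a pure 1-ms delay, i.e., linear phase.
def pure_delay(f_hz):
    return -2 * math.pi * f_hz * 1e-3

delay = delay_from_envelope_lag_s(envelope_lag_rad(pure_delay, 1704.0))
print(delay)
```

The quotient is the secant slope of the phase curve across the 166-2/3 Hz aperture, so fine structure narrower than the aperture is smoothed out of the estimate.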

The effect of a nonlinear phase-versus-frequency characteristic (as 
measured by EDD) on a data signal is such that the different frequency 
components of the signal have different transit times, which results 
in distortion in the received signal. The effect of EDD on data signals 
can be compensated by employing equalizers in the data set receiver. 
The effectiveness of an equalizer depends on the equalization scheme 
used—fixed or adaptive—and the complexity (e.g., number of taps) in 
the equalizer. 

Tables IX through XI show the means, standard deviations, and 
selected percentiles of EDD versus frequency for the short, medium, 
and long connections. As in the 1969/70 Connection Survey, 1704 Hz 
was selected as the reference frequency for EDD measurements in the 
EOCS. Figure 12 shows the mean EDD versus frequency for the short, 
medium, and long mileage categories. Although not shown in Fig. 12, 
the mean EDD for the 1969/70 Connection Survey is practically 
coincident with the short mileage category. 

Figure 13 shows the CDFs of the larger of the EDD values at 604 
and 2804 Hz for the short, medium, and long mileage categories. Figure 
14 is a scatter plot of EDD at 604 Hz compared to that at 2804 Hz, 
which shows that there is little dependence (correlation coefficient of 
0.53) between the EDDs at the two frequencies. Figures 15 and 16 
show that there is even less correlation between EDD and loss at these 
frequencies (correlation coefficient of 0.32 at 604 Hz and 0.37 at 2804 
Hz). 


3.4 Peak-to-average ratio 


Peak-to-Average Ratio (P/AR) measurements are made on a 
straightaway basis with a transmitter and a receiver attached at 

Table IX—Envelope delay distortion relative to 1704 Hz (all statistics
expressed in µs): short connections

Frequency       Standard                  Quantiles
in Hz    Mean  Deviation      1%     25%     50%     75%     99%

  304    3580       1221    1103    3029    3324    4226    6725
  404    1959        684     492    1695    1876    2185    3885
  504    1242        460     108    1114    1209    1365    2492
  604     839        334      15     748     819     922    1749
  704     577        258     -71     506     556     637    1278
  804     409        190     -35     356     398     456     922
  904     298        148     -71     254     293     333     700
 1004     205        121    -137     161     201     239     526
 1104     136         92    -125      95     128     165     375
 1204      87         79    -121      44      77     111     273
 1304      54         53     -67      20      48      76     199
 1404      36         47    -119      12      32      52     145
 1504      20         37    -150       5      19      32      91
 1604       7         36     -85      -1       7      14      66
 1704       0          0       0       0       0       0       0
 1804       6         32     -78      -2       4      11      69
 1904      21         32     -92      12      21      29     105
 2004      50         39     -58      35      49      61     146
 2104      82         47     -39      63      84      98     216
 2204     119         57     -27      92     122     142     259
 2304     166         65       8     136     165     197     329
 2404     220         88      36     177     212     261     493
 2804     609        217     102     541     584     727    1021
 2904     787        248     136     704     766     909    1243
 3004    1017        312     159     949    1008    1176    1567
 3104    1312        405     198    1248    1307    1466    1967
 3204    1734        551     226    1651    1744    1951    2606
 3304    2410        772     253    2304    2470    2625    3739
 3404    3132       1102     295    3062    3256    3443    5189


opposite ends of a connection. The transmitter generates a precisely 
controlled complex waveform of known peak-to-average ratio. The 
energy in the waveform is dispersed in time by the bandwidth reduction 
and envelope delay distortion encountered on the connection in a way 
that may be directly related to intersymbol interference (eye closing) 
of data signals.? The P/AR receiver measures the peak and full-wave 
average values of the waveform and displays their ratio on a zero- 
suppressed scale. A P/AR value of 100 suggests no pulse degradation. 

The P/AR signal is largely insensitive to noise, phase jitter, and 
intermodulation distortion, and is unaffected by frequency shift or 
transient phenomena. P/AR does not produce unambiguous diagnostic 
information, so there are no externally published requirements for 
P/AR. Since P/AR ignores transients, P/AR readings cannot predict 
data set error performance on connections where the transients dom- 
inate data set performance. P/AR values may be the same for different 
EDD shapes occurring on real connections, and with the addition of 

Table X—Envelope delay distortion relative to 1704 Hz (all statistics
expressed in µs): medium connections

Frequency       Standard                  Quantiles
in Hz    Mean  Deviation      1%     25%     50%     75%     99%

  304    4290       1292    2787    3261    3755    4951    7819
  404    2497        936    1579    1830    2101    2959    5465
  504    1633        631    1016    1191    1340    2009    3706
  604    1129        457     661     809     912    1403    2654
  704     794        339     418     557     638     977    1904
  804     574        249     275     401     456     735    1371
  904     427        189     190     299     339     553    1034
 1004     306        152     109     207     240     402     775
 1104     208        112      34     132     161     279     543
 1204     135         97     -18      77     106     183     378
 1304      86         69     -40      46      70     122     265
 1404      61         78     -31      30      48      83     182
 1504      37         56     -29      18      30      50     115
 1604      16         46     -31       6      12      22      55
 1704       0          0       0       0       0       0       0
 1804       1         71     -45      -9      -1       7      35
 1904      15         27     -53       3      16      26      79
 2004      45         46     -45      26      44      59     153
 2104      85         65     -33      59      80      98     252
 2204     126         77     -17      93     115     143     353
 2304     175        109       3     130     155     197     458
 2404     235        153      24     174     204     261     585
 2804     692        265     264     557     596     768    1588
 2904     888        308     383     723     772     977    1950
 3004    1156        377     551     955    1007    1261    2437
 3104    1523        481     157    1268    1321    1646    3175
 3204    2048        638    1079    1697    1784    2178    4242
 3304    2828        843    1694    2352    2507    2964    5558
 3404    3585        877    2277    3093    3285    3724    6320


sophisticated EDD adaptive equalizers in data sets, the utility of 
P/AR has diminished. P/AR serves as a quick straightaway measure 
of the relative bandwidth reduction and EDD on a significant per- 
centage of connections. 

Figure 17 shows that there is almost no relationship between P/AR 
and connection mileage except that there are almost no short connec- 
tions with a P/AR rating below 70. The figure also shows rare P/AR 
values above 100, which usually suggest connections where the loss at 
the band edges is less than that at the center of the band. Such 
connections increase the peak value of the received P/AR signal 
relative to the average value. Such a connection in tandem with a 
“normal” connection (which has higher loss at the band edges) will 
improve the P/AR value over that of the normal connection alone. 

Figures 18 through 20 are scatter plots of P/AR versus the maximum 
of EDDs at 604 and 2804 Hz, which are test frequencies for network 
performance objectives. These frequencies have no special relationship 

Table XI—Envelope delay distortion relative to 1704 Hz (all statistics
expressed in µs): long connections

Frequency       Standard                  Quantiles
in Hz    Mean  Deviation      1%     25%     50%     75%     99%

  304    5032       1427    2928    3616    4881    6055    8194
  404    2997       1245    1633    2028    2398    3624    6714
  504    1959        881    1031    1280    1498    2356    4534
  604    1366        643     680     875    1035    1640    3222
  704     966        469     438     610     729    1165    2292
  804     698        350     280     435     521     853    1685
  904     515        266     184     317     386     641    1255
 1004     371        201     105     224     275     470     921
 1104     258        146      53     153     191     328     659
 1204     173        112      16     101     133     217     459
 1304     111         15     -12      64      89     143     305
 1404      73         72     -26      40      57      96     216
 1504      42         62     -38      20      32      56     134
 1604      16         29     -31       4      12      23      66
 1704       0          0       0       0       0       0       0
 1804       2         28     -49      -7       2       9      39
 1904      20         29     -64       6      21      31      80
 2004      51         46     -65      32      51      66     141
 2104      90         59     -72      67      88     108     219
 2204     132         73     -66     102     125     154     304
 2304     181        115     -54     140     166     206     407
 2404     239        129     -16     186     214     275     515
 2804     726        275     285     567     604     807    1573
 2904     954        347     412     745     797    1051    2073
 3004    1258        452     588     976    1039    1393    2765
 3104    1659        586     836    1277    1374    1857    3657
 3204    2224        761    1182    1721    1852    2509    4788
 3304    3105        989    1750    2457    2623    3502    6249
 3404    3912       1018    2260    3247    3447    4496    6804


to the P/AR signal. These figures show that there is a reasonably 
strong correlation (particularly for longer connection mileages) be- 
tween EDD at 604 or 2804 Hz (whichever is higher) and P/AR, in 
that when EDD* is higher, P/AR is lower. The two straight lines 
(labeled 1 and 2) in these figures are the regression lines of P/AR 
versus EDD and EDD versus P/AR. The degree to which line 1 differs 
from line 2 is directly related to the coefficient of correlation: the two 
lines would coincide if P/AR and EDD were perfectly correlated. 
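
The link between the two regression lines and the correlation coefficient can be checked with a short Python sketch. When both lines are drawn in the same plane, the ratio of their slopes equals the square of the correlation coefficient, so the lines coincide only when the correlation is perfect. The data points below are hypothetical, and which of the two lines the figures label 1 or 2 is not assumed here.

```python
import math

def regression_slopes(x, y):
    """Return (r, slope of the y-on-x line, slope of the x-on-y line), both as dy/dx."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    # The x-on-y line has dx/dy = sxy / syy, i.e., dy/dx = syy / sxy.
    return r, sxy / sxx, syy / sxy

# Hypothetical (EDD in microseconds, P/AR) pairs, for illustration only.
edd_us = [400, 600, 800, 1000, 1200, 1500]
par = [95, 92, 90, 84, 85, 76]

r, slope_y_on_x, slope_x_on_y = regression_slopes(edd_us, par)
print(round(r, 3), round(slope_y_on_x / slope_x_on_y, 3))
```

Both lines pass through the point of means, so the angle between them is the only visible difference, and it closes as the magnitude of r approaches 1.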

* EDD in the remainder of this subsection refers to the maximum (per connection)
of EDDs at 604 and 2804 Hz.

Fig. 13—CDFs of the maximum of the two EDDs at 604 and 2804 Hz relative to 1704
Hz for short, medium, and long mileage bands.

Fig. 14—EDD (relative to 1704 Hz) at 2804 Hz versus EDD (relative to 1704 Hz) at
604 Hz.

Fig. 15—EDD versus attenuation distortion at 604 Hz, both relative to 1704 Hz.

Fig. 16—EDD versus attenuation distortion at 2804 Hz, both relative to 1704 Hz.

Fig. 17—CDFs of P/AR for short, medium, and long mileage bands.

Fig. 18—P/AR versus the maximum of the EDDs at 604 and 2804 Hz for the short
mileage band (563 measurements). Eighty-six percent of the points fall within the ellipse
and are not plotted.

Fig. 19—P/AR versus the maximum of the EDDs at 604 and 2804 Hz for the medium
mileage band (2586 measurements). Ninety percent of the points fall within the ellipse
and are not plotted.

Fig. 20—P/AR versus the maximum of the EDDs at 604 and 2804 Hz for the long
mileage band (3218 measurements). Ninety percent of the points fall within the ellipse
and are not plotted.

The ellipses of concentration shown in these figures have five
parameters: two of them determine the center position; one, the
angular orientation; and two, the lengths of the major and minor axes.
These parameters were determined so that a uniform, elliptical mass
of data points would have the same means, standard deviations, and
correlation coefficient as the original data. The ellipse of concentration
has two horizontal and two vertical tangents passing through the
points where, respectively, line 1 and line 2 intersect the ellipse. The
distance between the two horizontal tangents is equal to four times
the standard deviation of P/AR. Similarly, the distance between the
two vertical tangents is equal to four times the standard deviation of
EDD. Almost 90 percent of the measurements fall within the ellipses.
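
The ellipse of concentration described above can be written as a quadratic form in the standardized variables set equal to 4, which gives tangents two standard deviations either side of the mean on each axis. The Python sketch below counts the fraction of points inside such an ellipse. It uses simulated bivariate normal data with hypothetical means, standard deviations, and correlation; under that normality assumption the expected fraction inside is 1 - exp(-2), about 86 percent. The survey data need not be normal, so this is only a plausibility check.

```python
import math
import random

def sample_moments(xs, ys):
    """Means, standard deviations, and correlation coefficient of paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    r = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sx * sy)
    return mx, my, sx, sy, r

def inside_ellipse(x, y, mx, my, sx, sy, r):
    """True if (x, y) lies within the ellipse whose extreme tangents are at mean +/- 2 sigma."""
    u, v = (x - mx) / sx, (y - my) / sy
    return (u * u - 2 * r * u * v + v * v) / (1 - r * r) <= 4.0

# Simulated data: hypothetical EDD (microseconds) and P/AR values, correlation -0.7.
random.seed(1)
xs, ys = [], []
for _ in range(20000):
    a, b = random.gauss(0, 1), random.gauss(0, 1)
    xs.append(1000 + 300 * a)
    ys.append(85 + 6 * (-0.7 * a + math.sqrt(1 - 0.49) * b))

m = sample_moments(xs, ys)
fraction = sum(inside_ellipse(x, y, *m) for x, y in zip(xs, ys)) / len(xs)
print(round(fraction, 3))
```

A fraction near 86 percent is in line with the percentages quoted in the captions of Figs. 18 through 20.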


3.5 Propagation delay 


Round-trip propagation delay measurements were made in the 
EOCS by using full-duplex 1200-b/s data sets (to be described in 3.13) 
under the control of the microprocessor in the ASPEN RTU. An error 
was introduced in a continuous repetition of a 511-bit pseudorandom 
word transmitted by the near-end RTU microprocessor bit error rate 
generator to the low-band modulator of the full-duplex 1200-b/s data 
set. When the far-end RTU microprocessor recognized the error from 
the low-band demodulator of its 1200-b/s data set, it immediately 
introduced an error in the continuous repetition of the same 511-bit 
pseudorandom word being transmitted back to the near end by the 
high-band 1200-b/s data set modulator. When the near-end RTU 
microprocessor recognized the forced error, it corrected for the known 
(fixed) processing delay to get a first estimate of round-trip propaga- 
tion delay. This sequence was repeated nine more times to obtain 
enough valid measurements to reject those affected by random errors 
on the connection. 
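
The estimation step in this procedure can be sketched in Python as follows. The text states only that the fixed processing delay was removed and that the sequence was repeated to reject measurements affected by random errors; the rejection rule used here (discard estimates more than a tolerance away from the median) is an assumption, as are the processing-delay value, the tolerance, and the sample readings.

```python
import statistics

def round_trip_delay_ms(elapsed_ms, processing_delay_ms, tolerance_ms=2.0):
    """Average the delay estimates that agree with the median (assumed rejection rule)."""
    estimates = [t - processing_delay_ms for t in elapsed_ms]
    centre = statistics.median(estimates)
    valid = [e for e in estimates if abs(e - centre) <= tolerance_ms]
    return statistics.mean(valid), len(valid)

# Ten hypothetical elapsed times (ms) between sending and recognizing the forced
# error; two are corrupted by random line errors.
elapsed = [72.1, 71.9, 72.0, 72.2, 150.3, 72.0, 71.8, 72.1, 12.4, 72.0]
delay_ms, n_valid = round_trip_delay_ms(elapsed, processing_delay_ms=30.0)
print(round(delay_ms, 2), n_valid)
```

A median-based screen is used because a random bit error can make the far end respond too early or too late, and the median is insensitive to a minority of such readings.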

Figure 21 shows boxplots of round-trip propagation delay versus 
mileage (with the same abscissa as that of Fig. 1), except for the 52 
measurements taken on satellite connections.* Round-trip delay mea- 
surements on these satellite connections ranged from 508 to 539 ms. 
The variability in round-trip delay for the same end office building 
pairs can be attributed to alternate trunk facilities as well as alternate 
routing. Figure 22 shows CDFs of round-trip delay for the three
mileage bands. As in Fig. 21, the 52 measurements on satellite
connections were excluded from these CDFs. The boxplots and the CDFs
show a strong dependence of round-trip delay on airline mileage.


3.6 Message circuit noise and signal-to-C-notched noise ratio 


Message circuit noise was measured both with and without a 1004- 
Hz holding tone, with both the C-message and 3-kHz flat weighting 
filters, as Ref. 1 specifies. The C-message filter weighting characteristic 
was derived in 1957 from tests made with subjects assessing the 
interfering effects of single frequency interference as heard over an 
ordinary telephone handset. This weighting is also appropriate for 
high-speed data transmission because most high-speed modems con- 
centrate their transmitted energy in approximately the same band of 
sensitivity as the C-message filter. The ac power-line hum at 60 Hz 


* Propagation delay is the only parameter in this paper for which measurements on 
satellite connections are treated separately. 

Fig. 22—The cumulative distributions of round-trip propagation delay.

(and odd harmonics of 60 Hz), frequently encountered in the loop
plant, is attenuated by the C-message filter but is included in the noise 
measurements made with the 3-kHz flat filter. All noise measurements 
were reported as they were measured, without corrections for losses in 
the office in which the measurements were made, which permits direct 
comparison with the 1969/70 Connection Survey,* and concatenation 
with the loop plant.® 

If a connection has facilities with digital channel banks or compan- 
dors, noise on that connection can be substantially different, depend- 
ing on whether it is measured with or without a holding tone. For data 
transmission, therefore, C-notched noise is more relevant than C- 
message noise. C-notched noise is obtained by filtering out the 1004- 
Hz holding tone with a deep (50-dB) notch filter and then measuring 
C-message noise. Signal-to-C-notched-noise ratio (s/n) is the ratio of 
the received 1004-Hz holding tone power to the C-notched noise power. 
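
As a small worked example of this ratio, the Python sketch below converts a C-notched noise reading in dBrnC to dBm, using the standard reference of 0 dBrn = 1 pW = -90 dBm, and subtracts it from the received holding-tone power. The two readings are hypothetical.

```python
DBRN_REFERENCE_DBM = -90.0  # 0 dBrn corresponds to 1 pW, i.e., -90 dBm

def signal_to_c_notched_noise_db(tone_dbm, c_notched_dbrnc):
    """Received holding-tone power minus C-notched noise power, in dB."""
    noise_dbm = c_notched_dbrnc + DBRN_REFERENCE_DBM
    return tone_dbm - noise_dbm

# Hypothetical readings: -16 dBm received tone, 40 dBrnC notched noise.
s_n = signal_to_c_notched_noise_db(tone_dbm=-16.0, c_notched_dbrnc=40.0)
print(s_n)
```

Because both quantities are measured at the same point, the connection loss affects them equally and does not need to be known to form the ratio.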

The s/n is an essential performance measure for digital channel 
banks. As Fig. 23 shows, digital channel banks have an approximately 
logarithmic (µ = 255) encoder/decoder that maintains a nearly constant
s/n over a reasonable range of signal levels. Because most of the 
connections measured in the EOCS have at least one T-carrier link, 
the s/n results from the EOCS are dominated by this characteristic. 
This can be observed in the figures that follow. 
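
The nearly constant s/n of a logarithmic coder can be illustrated with a simple simulation. The Python sketch below passes a 1004-Hz sine wave sampled at 8 kHz through an idealized 8-bit µ = 255 compressor, uniform quantizer, and expander, and reports the signal-to-distortion ratio at several input levels. It uses the continuous µ-law curve rather than the 15-segment approximation of Fig. 23, applies no C-message weighting or notch filter, and measures levels relative to the coder overload point, so it is not a reproduction of Fig. 23.

```python
import math

MU = 255.0
LEVELS = 256  # 8-bit code

def compress(x):
    """Continuous mu-law compression of x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def codec(x):
    """Compress, quantize uniformly to 256 levels, and expand."""
    half = LEVELS // 2
    y = compress(max(-1.0, min(1.0, x)))
    index = min(half - 1, max(-half, math.floor(y * half)))
    return expand((index + 0.5) / half)

def sine_snr_db(level_db, samples=8000, rate_hz=8000.0, tone_hz=1004.0):
    """Signal-to-distortion ratio for a sine at level_db relative to overload."""
    amplitude = 10 ** (level_db / 20.0)
    signal_power = noise_power = 0.0
    for k in range(samples):
        x = amplitude * math.sin(2 * math.pi * tone_hz * k / rate_hz)
        error = codec(x) - x
        signal_power += x * x
        noise_power += error * error
    return 10 * math.log10(signal_power / noise_power)

snr = {level: sine_snr_db(level) for level in (0, -10, -20, -30)}
for level, value in snr.items():
    print(level, "dB:", round(value, 1), "dB")
```

In this idealized model the ratio stays within a few decibels over a 30-dB range of input level, which is the behavior the text attributes to digital channel banks.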

Figure 24 shows the CDFs of s/n for the three mileage bands. The 
CDFs confirm mileage dependence of s/n, particularly in the region of 
“good” s/n or the upper tails. Comparison of these CDFs with similar 
CDFs obtained from the 1969/70 Connection Survey shows that the 
variability of s/n on short connections has been reduced substantially 
since the last survey. This tighter s/n distribution for short connec- 
tions in the EOCS is caused by the introduction of T-carrier systems 
in the network since the 1969/70 Connection Survey. The same 
comparison suggests that the percentage of short connections with
s/n better than 40 dB (the s/n ceiling for digital channel banks
observed in Fig. 23) was greater in the 1969/70 Connection Survey
than in the EOCS. However, this s/n degradation, which can also be
attributed to T-carrier facilities, is largely inconsequential to the 
performance of data sets in the region where it occurs, i.e., the upper 
tail of the CDF. Listeners are usually unable to discern differences in 
s/n ratios above 40 dB. 

Figure 25 shows the Probability Density Functions (PDFs) of s/n 
for the three mileage bands. The PDF for the short mileage category 
shows bimodality, whereas the PDFs for the medium and long mileage 
categories are unimodal. The two peaks for the short category occur 
at 34 and 38 dB. The peak at 38 dB is consistent with the noise 
expected on a connection with one T-carrier system with one digital 

Fig. 23—Signal-to-distortion performance of 8-bit µ = 255 coder-decoder (15-segment
approximation, 8 segments +, 8 -).

channel bank at each end—one analog-to-digital (A/D) and one digi- 
tal-to-analog (D/A) conversion, i.e., an end-to-end digital connection 
or an analog connection with exactly one digital facility or digital 
switch; see Fig. 23. The lower peak is consistent with occurrences of 
two digital links in tandem. 

Figure 26 shows a scatter plot of C-notched noise versus C-message 

Fig. 24—CDFs of signal-to-C-notched-noise ratio for short, medium, and long mileage
bands.

Fig. 26—C-notched noise versus C-message noise.

noise. Also shown in the figure is a straight line on which C-notched
noise is equal to C-message noise. The large cluster of points above 
the line, corresponding to the connections with higher C-notched noise 
than C-message noise, shows the effects of compandors, quantizing
noise, and harmonic distortion on the holding tone. The points scat- 
tered far above the line may correspond to the connections where bad 
coders were encountered or where the C-notched noise measurements 
were affected by impulse noise. The points along the line represent 
the connections where the tone had no effect on noise. Small deviations 
from the line can be expected, considering possible time variation of 
noise between the two types of noise measurement. The points scat- 
tered far below the line indicate the possible effects of impulse noise 
during the C-message noise measurements. 

Figure 27 shows the CDFs of C-message noise for the three mileage 
bands. The mileage dependence of C-message noise can be observed 
in the figure, as would be expected for the analog carrier facilities 
normally encountered on longer trunks. The airline-mileage effect on 
the C-message noise can also be seen on the boxplots of Fig. 28 (see 
Fig. 2 for the explanation of the abscissa). 

Figure 29 presents the CDFs of C-notched noise for the three mileage 
bands. Mileage dependence is less apparent with C-notched noise than 
with C-message noise. In particular, Fig. 29 shows virtually no differ- 
ence in the CDF of C-notched noise between the medium and long 
mileage categories. It appears that most connections measured for 
these mileage categories had T-carrier facilities on the toll-connecting 
trunks, one at each end, which dominated the connection C-notched 
noise. The dominance of the toll-connecting trunk noise reduces the
dependence of noise on connection mileage for the longer connections. 
Digital switches in tandem with T-carrier links do not degrade the C- 
notched noise. 

It appears that the C-notched noise measurements for the short 
mileage category consist of measurements from two groups of connec- 
tions, as evidenced by the bimodality of the s/n PDF in Fig. 25: one 
group containing two pairs of digital channel banks, and the other 
containing one pair of digital channel banks. As Fig. 29 shows, the 
upper half of the CDF for the short mileage category is almost the 
same as the CDFs for the other two categories, suggesting that it is 
made up of the measurements from the connections with two pairs of 
digital channel banks. The lower half of the CDF for the short mileage 
category, however, is significantly different from those for the other 
two mileage categories, suggesting that the measurements are from
the connections with one pair of digital channel banks or from Voice
Frequency (VF) cable.

Figure 30 shows the CDFs of the 3-kHz flat noise for the three 

Fig. 27—CDFs of C-message noise for short, medium, and long mileage bands.

mileage bands. As we can see in the figure, the 3-kHz flat noise shows 
almost no dependence on mileage and is much higher than the C- 


message noise. This shows that this type of noise is dominated by 
sources outside the range of the C-message filter, primarily multiples 

Fig. 28—C-message noise versus airline mileage in 100-mile blocks.

Fig. 29—CDFs of C-notched noise for short, medium, and long mileage bands.


of 60 Hz from the end office line-circuit battery feed. The same 
remarks made for the 3-kHz flat noise hold for the 3-kHz flat notched 
noise. Figure 31 shows the CDFs of the 3-kHz flat notched noise for 
the three mileage bands. 

Figure 32 shows the CDFs of the 3-kHz flat noise-to-ground per 
environment of the end office: urban, suburban, and rural. An urban 

Fig. 30—CDFs of 3-kHz flat noise for short, medium, and long mileage bands.

Fig. 32—CDFs of 3-kHz flat noise-to-ground in central offices classified as rural,
suburban, and urban.




end office was defined in Ref. 8 as one serving more than 20,000 
assigned pairs; a rural end office as one serving fewer than 5000 
assigned pairs; and a suburban end office as one serving between 5000 
and 20,000 assigned pairs. The bottom of the measuring range for the 
test equipment for noise-to-ground was 40 dBrn, so measured values 
below this value were conservatively estimated at 39 dBrn, causing 
the truncation in the CDFs at that value. It should be recalled that no 
loops were connected for EOCS measurements. These measurements 
confirmed that the medians for 3-kHz flat noise-to-ground for trunks 
in the central office with no loops connected were well below the 
median contribution for trunks connected to the loop plant® (by 12 dB
for urban, 20 dB for suburban, and 40 dB for rural central offices).

Figure 33 shows the CDF of C-message noise on customer-prem- 
ises-to-customer-premises connections. The calculated customer C- 
message noise power of a connection was obtained by power-summing 
the noise power of the connection measured at the end office atten- 
uated by the loss of the loop, and the C-message noise on the loop 
(customer end). Figure 33 was obtained by analytically concatenating
(using discrete convolution techniques) the loop noise from the 1980
Loop Survey to the end-office-to-end-office C-message noise attenuated
by the loop loss. Also included in the same figure for comparison
is the CDF of the end office to end office C-message noise for the 
EOCS medium connection length category. The distribution of the C- 
message noise can be seen to improve with the addition of the loop. 
The contribution of the loop loss (which attenuates the noise from the 
end office) is apparently more important than the effect of the noise 
on the loop. 
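
The power summation described above can be sketched for a single connection as follows. The Python example attenuates the end-office noise by the loop loss and power-sums it with the noise on the loop itself. The three input values are hypothetical, and the study worked with whole distributions (via discrete convolution) rather than single values.

```python
import math

def power_sum_db(*levels_db):
    """Sum of uncorrelated noise powers, with inputs and result in dB units."""
    return 10 * math.log10(sum(10 ** (level / 10.0) for level in levels_db))

def customer_noise_dbrnc(end_office_noise_dbrnc, loop_loss_db, loop_noise_dbrnc):
    """End-office noise attenuated by the loop, power-summed with the loop noise."""
    return power_sum_db(end_office_noise_dbrnc - loop_loss_db, loop_noise_dbrnc)

# Hypothetical values: 30 dBrnC at the end office, 4 dB loop loss, 10 dBrnC loop noise.
noise_at_customer = customer_noise_dbrnc(30.0, 4.0, 10.0)
print(round(noise_at_customer, 1))
```

In this example the result is below the 30 dBrnC measured at the end office, which is the same direction of effect as reported for Fig. 33: when the loop's own noise is low, its loss outweighs its noise contribution.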

Figure 34 shows the CDF of C-message weighted metallic noise for 
Bell System loops from the 1980 Loop Survey used in the concatena- 
tion for Figure 33. (The CDF of 1004-Hz loss for the Bell System loops 
used in the concatenation appears in Fig. 5.)


3.7 Intermodulation distortion 


In the past, a harmonic distortion measurement was used to char- 
acterize nonlinearities by applying a single tone to the connection and 
measuring the received power at the second and third harmonics of 
the test frequency with a selective detector. However, this type of
measurement did not properly characterize nonlinearities as they
affected data transmission. The harmonics of the sine wave could 
cancel one another on connections with multiple nonlinearities, and 
the PDF of noise-like high-speed data signals is markedly different 
from that of a single tone. 

In the EOCS, nonlinearities were evaluated by an intermodulation 
distortion measurement using the four-tone method.’ The four-tone 
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Fig. 33—CDFs of the customer-premises-to-customer-premises C-message noise for 
short, medium, and long mileage bands. 


method is not subject to the cancellation effect, and the PDF of its 
test signal is a much better approximation to the PDF of a high-speed 
data modem signal than is a sinusoidal (one-tone) test signal. 

The intermodulation distortion measured with the four-tone test 
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noise. To correct for this, the noise component was measured by 
removing two of the four tones and measuring the energy in the narrow 
bands where the four-tone intermodulation distortion would fall. The 
corrected intermodulation distortion was then calculated by power 
subtraction of the noise component from the original measurement, 
as outlined in Ref. 1. (For example, if the power measured with four 
tones was 1 dB larger than that measured with two tones, the true 
intermodulation distortion would be 7 dB below the value measured 
with four tones. If the levels measured with four tones and with two 
tones were the same, the conservative value of 8 dB was subtracted 
from the four-tone intermodulation distortion measurement.) 
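
The power subtraction in the example above can be checked with a minimal Python sketch. The function name and the sample levels are assumptions for illustration; the text does not say how readings that differ by only a fraction of a decibel were handled, so the sketch simply applies the subtraction formula whenever the four-tone reading is the larger one.

```python
import math

def corrected_imd_db(four_tone_db, two_tone_db, equal_level_correction_db=8.0):
    """Remove the noise component (two-tone reading) from the four-tone
    reading by power subtraction. Both readings are power levels in dB."""
    excess_db = four_tone_db - two_tone_db
    if excess_db <= 0.0:
        # Readings indistinguishable: apply the conservative fixed correction.
        return four_tone_db - equal_level_correction_db
    distortion_fraction = 1.0 - 10.0 ** (-excess_db / 10.0)
    return four_tone_db + 10.0 * math.log10(distortion_fraction)

# Hypothetical readings: the four-tone reading is 1 dB above the two-tone reading.
correction = corrected_imd_db(-40.0, -41.0) - (-40.0)
```

For a 1-dB difference the computed correction is about 6.9 dB, which agrees with the 7 dB quoted in the text.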

Figures 35 and 36 show the CDFs of second- and third-order 
intermodulation distortion for the three mileage bands. The abscissa 
shows intermodulation distortion expressed as a signal-to-distortion 
ratio in decibels, and thus higher values on the abscissa represent 
better performance. As the figures show, there is little dependence
on mileage, particularly for the medium and long mileage categories.
Intermodulation distortion on connections in these mileage
categories could have been contributed mostly by central office equip- 
ment, such as multiplexors, whose appearance is only weakly corre- 
lated with mileage. 

The scatter plot of third-order versus second-order intermodulation 
distortion of Fig. 37 shows moderate correlation between the two 
parameters. 


3.8 Phase and amplitude jitter 


Phase jitter is the deviation or “jitter” of zero-crossings of a 1004- 
Hz tone from their nominal position in time. Phase jitter was measured 
by comparing the average phase of the signal (determined by a phase- 
locked loop) and the instantaneous phase of the received signal. The 
normal bandwidth for the measurement of (demodulated) phase jitter 
is 20 to 300 Hz. In the EOCS, phase jitter was measured in two bands: 
20 to 300 Hz and 2 to 300 Hz. The bandwidth of the phase jitter 
detector in the transmission test set used in the EOCS extends below 
the recommended 4-Hz corner of Ref. 1. 

Amplitude jitter is the deviation or “jitter” of the peak of a 1004-Hz 
tone from its nominal value. Amplitude jitter was measured with the 
same two frequency bands as the phase jitter, and the phase and 
amplitude jitter circuits used the same post-detection filter and peak 
detector. 

Figures 38 and 39 show the CDFs of phase jitter for the 20- to 300- 
Hz band and for the 2- to 300-Hz band, respectively, for the three 
mileage bands. The figures show dependence of phase jitter on mileage.

Fig. 35—CDFs of second-order intermodulation distortion for short, medium, and
long mileage bands.

Fig. 36—CDFs of third-order intermodulation distortion for short, medium, and long
mileage bands.

Fig. 37—Second-order versus third-order intermodulation distortion.

Fig. 38—CDFs of 20- to 300-Hz phase jitter for short, medium, and long mileage
bands.

Fig. 39—CDFs of 2- to 300-Hz phase jitter for short, medium, and long mileage
bands.


Figures 40 and 41 present the CDFs of amplitude jitter for the three
mileage categories, for the 20- to 300-Hz and 2- to 300-Hz bands, respectively.

Phase jitter can be caused by phase modulation as well as by noise. 
Measurement of phase jitter is appropriate to predict high-speed data 
set performance, but only as an indirect measure of phase modulation. 
However, the phase jitter measuring set alone cannot distinguish phase 
jitter caused by noise from that caused by phase modulation. The 
primary purpose of the amplitude jitter measurement is to separate 
phase jitter caused by the two sources. 

Although both noise and amplitude modulation can cause amplitude 
jitter, a signal is unlikely to encounter amplitude modulation sources 
in the network, leaving noise as the sole source of amplitude jitter. On 
the other hand, the network has both sources for phase jitter, namely, 
noise and phase modulation. Therefore, the amplitude jitter measure- 
ments can be compared with phase jitter measurements to distinguish 
phase jitter caused by the two sources. For example, a high phase jitter 
measurement accompanied by a low amplitude jitter measurement on 
a connection is an indication that the phase jitter on that connection 
is not caused by noise, but most likely is caused by phase modulation. 

Figure 42 is a scatter plot of the 20- to 300-Hz amplitude jitter 
versus the 20- to 300-Hz phase jitter. Also shown in the figure are two 
demarcation lines (labeled A and B) experimentally obtained by testing 
the phase and amplitude jitter measurement equipment used in the 
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Fig. 41—CDFs of 2- to 300-Hz amplitude jitter for short, medium, and long mileage 
bands. 
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Fig. 42—Amplitude versus phase jitter in the 20- to 300-Hz band. 


EOCS. Points between lines A and B indicate connections where 
phase and amplitude jitter show good correlation. Therefore, these 
points almost all correspond to phase jitter solely caused by noise. 
Points above line A show connections with high amplitude jitter but 
little phase jitter, and they are attributed to the effect of impulse noise 
on the amplitude jitter detector. (The requirements for all jitter 
detectors mandate peak detectors, which also respond to momentary 
increases from impulse noise.) Points below line B, showing connec- 
tions with high phase jitter but low amplitude jitter, suggest that phase 
jitter on those connections is largely caused by phase modulation, and 
is occasionally caused by impulse noise. 
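
The interpretation of the three regions of the scatter plot can be expressed as a simple decision rule. The following Python sketch is illustrative only: the demarcation lines were obtained experimentally and their values are not given in the text, so the two straight lines below are purely hypothetical, as are the function and label names.

```python
def classify_jitter(phase_jitter, amplitude_jitter, line_a, line_b):
    """Classify one connection from its position relative to lines A and B.

    line_a and line_b map a phase-jitter value to the amplitude-jitter
    value on the corresponding demarcation line (A lies above B).
    """
    if amplitude_jitter > line_a(phase_jitter):
        return "impulse noise on amplitude-jitter detector"
    if amplitude_jitter < line_b(phase_jitter):
        return "phase modulation (occasionally impulse noise)"
    return "noise"

# Hypothetical straight lines through the origin, for illustration only.
def line_a(phase):
    return 1.5 * phase

def line_b(phase):
    return 0.5 * phase

labels = [classify_jitter(p, a, line_a, line_b)
          for p, a in [(4.0, 4.0), (2.0, 6.0), (10.0, 2.0)]]
```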


3.9 Frequency shift 


Frequency shift, or absolute frequency offset, is a critical parameter, 
for example, for the proper functioning of echo cancelers. Echo can- 
celers continue to adapt during voice calls and can track frequency 
shifts below 1 Hz. Depending on the magnitude of the frequency shift, 
brief transient echoes may be heard after conversational pauses, since 
adaptation only occurs when just one party is talking. Since echo 
cancelers freeze on calls where a continuous data set signal is present, 
frequency shift will cause echoes whose magnitudes change at the 
frequency shift rate.

Frequency shift was measured in the 1969/70 Connection Survey by
transmitting at -12 dBm a 1200-Hz tone whose frequency was known
to 0.1 Hz. The frequency of the received tone was measured to a
precision of 0.1 Hz at the far end of the connection, leading to an
overall accuracy of approximately 0.2 Hz. The difference between the
two frequencies was the frequency shift of the connection. The mea- 
sured frequency shifts were not normally distributed. An offset of 
greater than 3 Hz was observed on two of the 600 measurements. 

In the EOCS, the transmitted frequencies were known to a precision 
of 0.1 Hz. The received frequencies were measured to a resolution of 
1 Hz with a frequency counter that could be momentarily driven 
upward by impulse noise, and could momentarily be driven downward 
by transient power line harmonics. Connections with a poor signal-to-noise
ratio caused a positive 1-Hz offset in the frequency counter output. Taking only
those connections for which multiple, stable frequency shifts were 
observed: 

1. A positive frequency shift of 2 Hz was observed on eight of 4222 
connections (0.19 percent). 

2. A negative frequency shift of 1 Hz was observed on ten of 4222 
connections (0.24 percent). 

All but one of the stable frequency shifts observed were for connec- 
tions to a single end office that had N3 (frequency division multiplex) 
carrier toll-connecting trunks. The precision of the frequency mea- 
surements in the EOCS is not sufficient to draw conclusions about 
the performance of the network for frequency shift, particularly since 
the observed shifts were associated with the toll-connecting trunks for 
a single office. 


3.10 Impulse noise 


Impulse noise was measured through a C-notched filter by counting 
the number of times the noise exceeded a given threshold. The impulse 
noise measurement consisted of three five-minute measurements in 
sequence, with three different thresholds for each five-minute interval. 

The thresholds for impulse noise counts were set relative to the
received rms level of the 1004-Hz holding tone. For the first five
minutes, thresholds were set at -12, -8, and -4 dB relative to the
received holding tone level; for the second five minutes, at -12, -4,
and +4 dB; for the last five minutes, at -8, 0, and +8 dB. Since
measurements at the -12, -8, and -4 dB thresholds were made twice
for five minutes on each connection, while measurements at the 0, +4,
and +8 dB thresholds were made only once for five minutes per
connection, all figures and results in this section are based on twice
as many observations for the -12, -8, and -4 dB thresholds as for
the three higher thresholds.
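
The observation time per threshold follows directly from the schedule just described. A minimal Python sketch of the tally is given below; the variable names are illustrative and not taken from the study.

```python
from collections import Counter

# Impulse-noise thresholds (dB relative to the received holding tone)
# for the three consecutive five-minute intervals.
SCHEDULE_DB = [(-12, -8, -4), (-12, -4, 4), (-8, 0, 8)]

intervals_per_threshold = Counter(t for interval in SCHEDULE_DB for t in interval)
minutes_per_threshold = {t: 5 * n for t, n in sorted(intervals_per_threshold.items())}
```

The three lower thresholds each accumulate ten minutes of observation per connection, and the three higher thresholds five minutes each.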
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Impulse noise counters with multiple thresholds have the charac- 
teristic that a single impulse exceeding the highest threshold must 
also register on all the lower thresholds. This means that, in any given 
five-minute interval, the counter for the lowest threshold will have a
count equal to or higher than that of the counter for a higher threshold. As
one would expect, the CDF of impulse noise counts at the -12 dB
threshold falls to the right of the CDF at the -8 dB threshold. Since
the thresholds were changed for each of the three five-minute transient
measuring intervals, it is possible (but unlikely) that on any given
connection, there could be a smaller count for the 0-dB threshold than
for the 4-dB threshold, for example.
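
The nesting of counts within a single five-minute interval can be demonstrated with a small Python sketch. The impulse peak values are hypothetical, and the function is a simplified model of a multiple-threshold counter rather than a description of the test set.

```python
def count_exceedances(peaks_db, thresholds_db):
    """Count the impulses that exceed each threshold (levels in dB
    relative to the received holding tone)."""
    return {t: sum(1 for peak in peaks_db if peak > t) for t in thresholds_db}

# Hypothetical impulse peaks observed in one five-minute interval.
peaks = [-11.0, -9.5, -7.0, -3.0, -3.5, 1.0]
counts = count_exceedances(peaks, (-12, -8, -4))
```

Because every impulse that exceeds a higher threshold also exceeds the lower ones, the counts can only decrease or stay the same as the threshold rises. Counts taken in different intervals are not bound by this ordering.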

The effect of connection airline mileage on impulse noise count was
found not to be statistically significant. The type of switch at either
end of the connection was found to be a significant factor. Figures 43 
through 45 show the CDFs of impulse noise counts per five-minute 
interval at various thresholds for different switch types (digital, cross- 
bar, and step-by-step switches) at the measuring end office. The type 
of switch in the end office from which the tone was sent was taken 
into consideration through a weighting process based on predivestiture 
Bell System traffic statistics. A connection with a digital switch at the 
measuring end is expected to have fewer impulse noise counts than a 
connection with an electromechanical switch (crossbar, or step-by- 
step) in which the operation and release of adjacent relays can some- 
times cause impulse noise. 

The limiting bend in the CDF for the -12 dB threshold at approximately
2000 counts per five-minute interval in Fig. 45 might have
been caused by power-line-hum pickup in the small rural step-by-step
offices, which caused continuous impulse counts at the maximum
counting rate of 420 counts per minute (2100 counts in five minutes).

Figures 46 and 47 show the effect of switch type at both ends of the 
connection. Three types of switch were evaluated in the EOCS, leading 
to six nonordered pairs of switches when both end switches were taken 
into consideration. Figure 46 shows the percentage of impulse noise
counts per five-minute interval for the -12 dB threshold for each of
the six different pairs of switches. The bars are divided, reading from the
bottom up, as 0 counts, 1 to 10 counts, 11 to 20 counts, 21 to 50 counts,
and more than 50 counts. The digital switch performs better than the
crossbar, which is superior to the step-by-step, and this holds for either
end of the connection. Figure 47 is similar to Fig. 46 but is for the -4
dB threshold.
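
The count of six nonordered pairs can be verified by enumeration. The following Python sketch uses the standard-library function `itertools.combinations_with_replacement`; the switch-type labels are taken from the text and the variable names are illustrative.

```python
from itertools import combinations_with_replacement

SWITCH_TYPES = ("digital", "crossbar", "step-by-step")

# Unordered pairs of end-office switch types, same-type pairs included.
pairs = list(combinations_with_replacement(SWITCH_TYPES, 2))
```

The enumeration yields three same-type pairs and three mixed pairs, six in all.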

Figure 48 shows the impulse noise counts per five-minute interval 
for the -12 dB threshold at the step-by-step measuring end office
switch plotted as a function of the time of day. It indicates an increase 
in the impulse noise counts in step-by-step offices during the busy 




Fig. 43—CDFs of the impulse noise counts for the different thresholds as measured
at electronic switching system end offices. [Curves for the 8-, 4-, 0-, -4, -8, and
-12 dB thresholds; ordinate: percent with value less than abscissa; abscissa:
impulse counts in five minutes.]


hours of a day, as one would expect when step-by-step switches in 
adjacent bays operate and release as other customers start and finish 
calls. A similar effect, with a smaller amplitude, was observed for the 
other thresholds and for the other types of switch.

Fig. 44—CDFs of the impulse noise counts for the different thresholds as measured
at crossbar end offices.

Fig. 45—CDFs of the impulse noise counts for the different thresholds as measured
at step-by-step end offices.

Fig. 46—Impulse noise counts per five-minute interval for the threshold of 12 dB
below the received signal and for the six different pairs of end offices.

Fig. 47—Impulse noise counts per five-minute interval for the threshold of 4 dB
below the received signal and for the six different pairs of end offices.

Fig. 48—Impulse noise counts for -12 dB threshold measured at step-by-step
switches shown as a function of time of day.


3.11 Phase and gain hits 

A phase hit is an abrupt change in the nominal phase of the received 
1004-Hz holding tone lasting at least 4 ms. A gain hit is an abrupt 
change in the nominal level of the received 1004-Hz holding tone 
lasting at least 4 ms. The precision of the phase- and gain-hit mea- 
surement, the tracking rate for the phase-locked loop for the phase- 
hit counter, and the rate of change for the automatic gain control for 
the gain-hit counter are given in Ref. 1. 

Phase and gain hits were measured simultaneously with impulse 
noise during the three five-minute transient measurement periods 
discussed in the previous section. During the first five minutes, the 
phase-hit threshold was set at 10 degrees and the gain-hit threshold
was set at 2 dB; during the second five minutes, the phase- and gain- 
hit thresholds were set at 15 degrees and 3 dB, respectively; finally, 
during the last five minutes, the phase- and gain-hit thresholds were 
set at 20 degrees and 6 dB, respectively. The phase- and gain-hit 
counting circuits respond to both positive and negative hits. 

Figure 49 shows the CDFs of phase-hit counts per five-minute 
interval for the thresholds of 10, 15, and 20 degrees. The figure shows 
little difference between the CDFs corresponding to the two higher 
thresholds, 15 and 20 degrees. However, the CDF for the 10-degree
threshold is clearly worse than the other two CDFs. This may be
caused by low-frequency phase modulation which can trigger the 
phase-hit counter at the 10-degree threshold. 

Figure 50 shows the CDFs of gain-hit counts per five-minute interval 
for the thresholds of 2, 3, and 6 dB. There is a distinct reduction in 
the counts as the threshold is increased. 


3.12 Dropouts 

A dropout is defined as a 12-dB reduction of received signal level, 
as measured at the start of the 15-minute transient measurement 
interval, lasting for at least 4 ms. The dropout counter circuit has no 
automatic gain control circuit in contrast with the gain-hit counter. 
Figure 51 shows the CDF of dropout counts per 15-minute interval. 
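
The dropout definition can be restated as a counting rule over a sampled level record. The Python sketch below is a simplified model and not a description of the counter circuit: the sampled representation, the sample period, and the reading of "a 12-dB reduction" as "at least 12 dB below the reference" are all assumptions.

```python
def count_dropouts(levels_db, reference_db, sample_period_ms,
                   drop_db=12.0, min_duration_ms=4.0):
    """Count runs in which the level stays at least drop_db below the
    reference level for at least min_duration_ms."""
    threshold_db = reference_db - drop_db
    count = 0
    run_ms = 0.0
    counted = False
    for level in levels_db:
        if level <= threshold_db:
            run_ms += sample_period_ms
            if run_ms >= min_duration_ms and not counted:
                count += 1          # count each qualifying run only once
                counted = True
        else:
            run_ms = 0.0
            counted = False
    return count

# Hypothetical record sampled every 1 ms with a -10 dBm reference level:
# a 3-ms fade (too short to count) followed by a 6-ms fade (counted).
record = [-10.0] * 5 + [-25.0] * 3 + [-10.0] * 2 + [-25.0] * 6 + [-10.0] * 3
dropouts = count_dropouts(record, reference_db=-10.0, sample_period_ms=1.0)
```

The reference level is fixed for the whole record, which reflects the absence of automatic gain control in the dropout counter.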


3.13 Bit and block error rates 


The error rate performance of two widely used 1200-b/s data sets 
and two widely used 4800-b/s data sets was measured on the same 
connections on which the analog transmission impairments were 
measured. To simulate the environment in which the data sets would 
normally be operating, an artificial loop was placed in front of the 
data set in each ASPEN RTU. The artificial loop simulates 6000 feet 
of 26-gauge cable in series with 6000 feet of 24-gauge cable to achieve 
the mean loop loss of 5.3 dB determined from the 1980 Loop Survey. 
The data signal level at the data set was approximately —9 dBm. 
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Fig. 49—CDFs of the phase hit counts for the thresholds of 10, 15, and 20 degrees. 


The two 1200-b/s data sets used in the EOCS were full-duplex, four- 
phase, Differential Phase-Shift Keyed (DPSK) data sets. The data 
sets transmitted 1200-b/s (600-baud) synchronous binary serial data 
simultaneously in both directions by splitting the voiceband into a low 
band (carrier frequency of 1200 Hz) and high band (carrier frequency 
of 2400 Hz) of frequencies. The energy in the low-band line signal 
extended from 720 to 1680 Hz, and the high band, from 1920 to 2880 
Hz. These data sets employed scramblers to prevent a steady marking
or spacing input to the transmitter from causing a continuous stream of
zero-degree phase shifts on the line (which would prevent the receiver
timing-recovery circuit from extracting its reference phase from the
incoming signal). These data sets had no adaptive equalizers to mitigate
the effects of bandwidth reduction or poor EDD on the connection.

Fig. 50—CDFs of the gain-hit counts for the thresholds of 2, 3, and 6 dB.

The two 4800-b/s data sets used in the EOCS were half-duplex, 
eight-phase DPSK data sets. They transmitted a 1600-baud (3 bits 
per symbol) signal, and 23-stage, multiple-tap scramblers were em- 
ployed for the same reason as the scramblers in the 1200-b/s data sets. 
Most of the energy in the line spectrum was between 800 and 2800 
Hz, and neither data set had a secondary channel. The receivers had 
adaptive equalizers to reduce the effects of connection bandwidth 
reduction and EDD on intersymbol interference. 
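
The relationship between symbol rate and bit rate for the two kinds of data set can be checked with a short Python sketch. The function name is illustrative; the symbol rates and numbers of phases are those stated in the text.

```python
import math

def bit_rate(baud, phases):
    """Bit rate of a DPSK signal: symbol rate times bits per symbol."""
    return baud * int(math.log2(phases))

rate_1200 = bit_rate(600, 4)     # four-phase DPSK at 600 baud
rate_4800 = bit_rate(1600, 8)    # eight-phase DPSK at 1600 baud
```

Four phases carry 2 bits per symbol and eight phases carry 3 bits per symbol, which gives 1200 b/s and 4800 b/s, respectively.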

The solid and dotted curves of Figs. 52, 53, and 54 show the CDFs 
of the bit error rates for the two 1200-b/s data sets for short, medium, 
and long connections. For these three figures, the low- and high-band
error rates were combined. The 1200-b/s data set bit error performance 
is poorer for longer connections. In Fig. 55, the bit error rates for long 
connections were separated into low and high band for the two 1200- 
b/s data sets. This figure demonstrates how the selection of the 
compromise equalizer in the data set can affect the relative perform- 
ance of the two bands. 

Since about one million bits were transmitted, there are no data 
points on the CDFs between “No Errors” and a bit error rate of 1 × 10⁻⁶.
The bit error rate counter in the ASPEN RTU attempted to resyn- 
chronize when 99 errors were received, and therefore the bit error rate 
plots were truncated at the value corresponding to 99 errors. 
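
The limits of the bit error rate plots follow from the number of bits sent and the 99-error resynchronization limit. In the Python sketch below, the bit count is taken as exactly one million, which is an assumption (the text says "about one million"), and the names are illustrative.

```python
BITS_SENT = 1_000_000    # assumed exact; the text says "about one million bits"
MAX_ERRORS = 99          # the counter attempted to resynchronize at 99 errors

def bit_error_rate(errors, bits=BITS_SENT):
    """Bit error rate for a given number of errored bits."""
    return errors / bits

smallest_nonzero_ber = bit_error_rate(1)
truncation_ber = bit_error_rate(MAX_ERRORS)
```

Under this assumption the smallest nonzero bit error rate that can be plotted is 1 × 10⁻⁶, and the plots are truncated just below 1 × 10⁻⁴.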

In the middle of the data collection period, the ASPEN RTUs were 
modified to permit both bit and block error rate measurement. For 
block error rate measurements, one thousand 1000-bit blocks were 
transmitted, and the restriction of no more than 99 errors in a single 
block was removed. All bit error rate figures include data collected 
over the entire EOCS data collection period, while the block error rate 
figures come from only the later part of the collection period. The 
CDFs for the block error rate performance for the two 1200-b/s data 
sets for short, medium, and long connections are shown in Figs. 56, 
57, and 58. The block error rate performance of the 1200-b/s data sets 
is also poorer for the longer connections. 

Figures 59, 60, and 61 show the bit error rate performance of the 
two 4800-b/s data sets for short, medium, and long connections.* As 


* All error rate data presented here were taken with continuous carrier mode data 
set operation. Continuous carrier mode is the normal operation mode for the 1200-b/s 
full-duplex data sets. For the 4800-b/s half-duplex data sets, however, switched carrier 
mode is the typical mode of operation. In general, error performance with switched 
carrier mode operation is poorer than that with continuous carrier mode operation.

Fig. 51—CDF of dropout counts.

Fig. 52—CDFs of bit error rates for two 1200-b/s data sets for short connections.

Fig. 53—CDFs of bit error rates for two 1200-b/s data sets for medium connections.

Fig. 54—CDFs of bit error rates for two 1200-b/s data sets for long connections.

Fig. 55—CDFs for bit error rates for low and high bands for two 1200-b/s data sets
for long connections.

Fig. 56—CDFs of block error rates for two 1200-b/s data sets for short connections.

Fig. 57—CDFs of block error rates for two 1200-b/s data sets for medium connections.

Fig. 58—CDFs of block error rates for two 1200-b/s data sets for long connections.

Fig. 59—CDFs of bit error rates for two 4800-b/s data sets for short connections.

Fig. 60—CDFs of bit error rates for two 4800-b/s data sets for medium connections.

Fig. 61—CDFs of bit error rates for two 4800-b/s data sets for long connections.


we can see from the comparison of these figures and Figs. 52, 53, and 
54, the bit error rates for the 1200- and 4800-b/s data sets are not
markedly different. This could be expected, considering that the much 
more expensive 4800-b/s data sets used in the EOCS had adaptive 
equalizers, and that the 1200-b/s, full-duplex data transmission is 
really two independent 1200-b/s data transmissions on the same 
connection. 

The block error rate performance of one of the 4800-b/s data sets 
for short, medium, and long connections is shown in Fig. 62, once 
again demonstrating the poorer performance for long connections. 

Figure 63 is a scatter plot of bit error rate versus block error rate 
for both of the 1200-b/s data sets, with a small amount of dither added 
along the horizontal axis to show where multiple points occur. As we can
see from the line where errors occur in only one of the one thousand 
blocks (0.001), there is a preponderance of bit error counts of three, 
six, nine, and twelve. Figure 64 is a similar plot for one of the 4800-b/s 
data sets, which shows a tendency for an even number of error counts. 

The figures that compare the two 1200-b/s data sets (or the two
4800-b/s data sets) show that there are differences in their performance.
Prediction of the performance of other types of data sets by
extrapolation from these results is not warranted.

Fig. 62—CDFs for block error rate for a 4800-b/s data set for short, medium, and
long connections.

Fig. 63—Block error rate versus bit error rate for two 1200-b/s data sets for one
thousand 1000-bit blocks.

Fig. 64—Block error rate versus bit error rate for a 4800-b/s data set for one thousand
1000-bit blocks.


3.14 Cutoffs 


The only nontransmission parameter estimated from the EOCS 
measurements is the call cutoff rate. A cutoff is a connection dropped 
prematurely; it can occur primarily because of failure in the commu- 
nications path caused by transmission or switching problems. Failures 
resulting in carrier group alarms on T-carrier systems and talk off of 
in-band channel signaling are sources of transmission-caused cutoffs. 
Cutoffs caused by switching systems are the result of hardware failures, 
software failures (e.g., in digital switches), and procedural errors. The 
cutoff rate can be measured by examining a sample of representative 
connections through the network and observing the number of calls 
disconnected prematurely. 

Four thousand toll connections were checked in the EOCS sample 
for possible cutoffs during the 15-minute intervals of impulse noise 
measurements discussed in Section 3.10. A cutoff was tallied if a call 
was disconnected at any time during the 15 minutes, given that the
call was up at the beginning of the same 15-minute interval. Twenty-two
cutoffs occurred during the 60,000 call-minute sample (4000 × 15),
leading to a nonweighted average cutoff rate of 2.2 × 10⁻³ for a
six-minute holding time. The 90-percent confidence interval was
calculated to be between 1.4 × 10⁻³ and 3.0 × 10⁻³.
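
The cutoff-rate arithmetic can be reproduced with a few lines of Python. The point estimate follows directly from the figures in the text. The confidence-interval method is not stated in the text, so the sketch assumes a Poisson count with a normal approximation (z = 1.645 for 90 percent); this assumption happens to give limits close to those quoted, but it should not be read as the authors' method.

```python
import math

cutoffs = 22
calls = 4000
minutes_per_call = 15
holding_time_min = 6

call_minutes = calls * minutes_per_call                    # 60,000 call-minutes
rate_per_call = cutoffs / call_minutes * holding_time_min  # per six-minute call

# Assumed method: Poisson count, normal approximation, 90-percent two-sided.
half_width = 1.645 * math.sqrt(cutoffs)
low = (cutoffs - half_width) / call_minutes * holding_time_min
high = (cutoffs + half_width) / call_minutes * holding_time_min
```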
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